Week 2: So What Exactly Is Transcriptome Sequencing?
March 19, 2023
Thanks for coming back to my blog!
Last time, we discussed the central dogma of molecular biology and how genetic information “flows” between DNA, RNA, and proteins.
In this post, I will go over the specifics of the planned research, including the methods, goals, expectations, and potential problems.
But first, let’s cover some background information. In a project I conducted last year, I was the first person to sequence the genome of the freshwater angelfish. The genome is the complete set of DNA in an organism’s cells. Sequencing the genome means that I determined the order of the nucleotides that make up that DNA. (For those who are inclined, you can read my full publication here.) The final genome assembly consisted of 15,486 contigs (contiguous sequence fragments) and was 734.79 Megabases in size, with a BUSCO score (a measure of completeness) of 86.5%. Functional annotation of the genome revealed 24,247 protein-coding sequences with similarity to sequences from other fish species. Of these, 14,329 (59%) were orthologous to genes in Archocentrus centrarchus, a closely related Neotropical cichlid.
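For readers who like to check the numbers, here is a minimal sketch that recomputes two figures from the assembly statistics quoted above: the share of orthologous genes and the average contig length. The variable names are my own, and the sketch assumes 1 Megabase = 1,000,000 bases and 1 kb = 1,000 bases. The mean contig length is only a rough summary, since real contigs vary widely in size.

```python
# Figures quoted in the paragraph above.
assembly_size_mb = 734.79   # total assembly size in Megabases
contig_count = 15_486       # number of contigs in the assembly
protein_coding = 24_247     # annotated protein-coding sequences
orthologous = 14_329        # sequences orthologous to A. centrarchus

# Share of annotated sequences with an ortholog, as a percentage.
ortholog_pct = 100 * orthologous / protein_coding

# Average contig length in kilobases (assumes 1 Mb = 1000 kb).
mean_contig_kb = assembly_size_mb * 1000 / contig_count

print(f"Orthologous share: {ortholog_pct:.1f}%")
print(f"Mean contig length: {mean_contig_kb:.1f} kb")
```

The first value rounds to the 59% reported above, and the second works out to roughly 47 kb per contig.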
Aims
In my current senior project, I plan to expand on my previous work by sequencing the transcriptome of the freshwater angelfish. These new data will help me answer the question: which mRNAs and non-coding RNAs does the freshwater angelfish express, and how do they relate to gene function and structure?
Novelty
There are many reasons why I believe this project is valuable. First, this work is completely novel. No prior research has sequenced the whole transcriptome of the freshwater angelfish. This means that when I deposit my results with NCBI (the National Center for Biotechnology Information), I will contribute completely new information to the scientific community.
Second, angelfish have the potential to act as a model organism for future biomedical research. Because many experiments cannot ethically be performed on humans, scientists instead conduct their research on animals like the angelfish and then translate the results to the human body. For example, past research papers have used cichlid fishes to study memory, craniofacial variation, bacterial infections, and viral infections (Powder and Albertson, 2016; Maruska and Fernald, 2018). However, compared to the zebrafish, which is currently the most popular aquatic model organism, the angelfish has not been studied as thoroughly. A thorough understanding of angelfish genetics is necessary before the species can serve as a basis for future research. By sequencing the angelfish transcriptome and increasing the amount of information available about this fish, I hope to make scientists more likely to use it in their studies.
Steps for sequencing:
- Extract RNA from angelfish tissue with an RNA extraction kit.
- Prepare the RNA for sequencing (i.e., reverse transcribe the RNA into cDNA).
- Sequence the cDNA with the Oxford Nanopore MinION Mk1B sequencing device.
- Basecall the raw signal from the sequencer to obtain the cDNA read sequences.
- Analyze data with the Galaxy bioinformatics platform.
- Continue analysis with various open-source tools.
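To make the reverse transcription step a little more concrete, here is a toy sketch of what it means at the sequence level: the first-strand cDNA is the reverse complement of the RNA, written with T in place of U. The function name and the example sequence are made up for illustration. In the lab this step is carried out by a reverse transcriptase enzyme, not by software.

```python
# Pairing rules between an RNA template base and the DNA base built opposite it.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna: str) -> str:
    """Return the first-strand cDNA (5'->3') for an RNA sequence written 5'->3'."""
    # Walk the RNA from its 3' end so the cDNA is also written 5'->3'.
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))

example_rna = "AUGGCCUAA"  # made-up sequence, not real angelfish data
print(reverse_transcribe(example_rna))  # prints TTAGGCCAT
```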
Potential pitfalls
There is very little literature on extracting RNA from angelfish tissue, so I might need to try several extraction kits or methods before I find one that works. I also may not obtain enough sequencing data on my first run. If that happens, I’ll need to repeat the sequencing run until I have collected enough data.
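One way to judge whether a run produced enough data is to summarize the basecalled reads: how many there are, how many bases they contain in total, and their N50 (the read length at which reads of that length or longer hold at least half of all bases). The sketch below is a rough illustration, not my actual pipeline. It assumes the basecaller writes standard FASTQ with four lines per read, the helper names are hypothetical, and the three reads are invented so the example runs on its own.

```python
import io

def read_lengths(fastq_handle):
    """Yield the length of each read in a 4-line-per-record FASTQ stream."""
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 == 1:  # the second line of each record is the sequence
            yield len(line.strip())

def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_of_bases = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_of_bases:
            return length
    return 0  # no reads at all

# Tiny invented FASTQ standing in for a real basecalled file.
example_fastq = io.StringIO(
    "@read1\nACGTACGTAC\n+\nIIIIIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read3\nACGTAC\n+\nIIIIII\n"
)

lengths = list(read_lengths(example_fastq))
print(f"reads: {len(lengths)}, total bases: {sum(lengths)}, N50: {n50(lengths)}")
```

For a real run, the in-memory example would be replaced by an opened FASTQ file, and the totals compared against whatever yield the downstream analysis needs.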
Next Steps
In my next post, I’ll detail my RNA extraction methodology, including any successes or failures. Thank you for reading my second blog post, and I hope you’ll join me for the next one!
Sources:
Madireddy I. 2022. First Ever Whole Genome Sequencing and De Novo Assembly of the Freshwater Angelfish, Pterophyllum scalare. microPublication Biology. doi: 10.17912/micropub.biology.000654.
Maruska KP, Fernald RD. 2018. Astatotilapia burtoni: A Model System for Analyzing the Neurobiology of Behavior. ACS Chem Neurosci 9: 1951-1962.
Powder KE, Albertson RC. 2016. Cichlid fishes as a model to understand normal and clinical craniofacial variation. Dev Biol 415: 338-346.