Week 1: Literature Review & Coding!
February 27, 2025
Hello everyone! Welcome to the first week of my Senior Project.
This week, I was busy reading research papers, drawing maps of signaling pathways, writing (and debugging) code, and learning how to use the Wynton HPC Compute Cluster (!!) (as well as vibing to music on the BART to UCSF…)
My project has taken me into the vibrant, complex world of the RNA-editing enzyme ADAR1. By reading around 8 original research and review papers this week, I learned that ADAR1 has two primary isoforms (ADAR1p110 and ADAR1p150), each with a distinct cellular localization, range of functions, and set of disease implications. I explored recent studies investigating how ADAR1 relates to obesity (its overexpression inhibits fat cell differentiation, which helps slow the progression of obesity), as well as its role in autoimmune diseases, viral infections, and cancer. I also learned how ADAR1 interacts with other cellular proteins, notably MDA5, PKR, and DICER.
The papers I read this week describe ADAR1’s interactions with different, sometimes seemingly unrelated, proteins, which made it difficult for me to see the full picture. So, I embarked on the daunting task of synthesizing these papers into a clear (albeit complicated) web of pathway interactions. After a lot of trial and error (many pencil lines were erased and redrawn…) and revision by my mentor, I was able to draw up a map of ADAR1’s complex and multifaceted interactions, which will help me interpret my computational results later on!
My computational project aims to understand ADAR1’s mechanism of action by identifying its RNA-editing sites and RNA-editing-site clusters. Identifying RNA-editing sites from FASTQ files is a lengthy, multi-step process: one look at a never-ending FASTQ file and I knew I was in for a ride. This week, I surveyed several software packages that each handle part of this task, including REDItools, SPRINT, and RNAEditor, and I noted their pros and cons. I then explored a more recent tool developed by Torkler et al., the local differential editing index (LoDEI), which improves on these existing packages in multiple ways. For one, it uses a window-based detection approach, which accounts for the fact that ADAR1 editing that differs between two experimental conditions may not occur at exactly the same site. Also, since it directly identifies differential editing sites, manual removal of SNPs is not needed! Wonderful! So, I will use this tool for RNA-editing site detection and then do further analysis on its output files.
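To make the window idea concrete, here is a minimal Python sketch of why windows help when the edited position shifts between conditions. It is not LoDEI's implementation: the fixed, non-overlapping windows, the function names, and the toy counts are all assumptions made for illustration, and the real tool's statistics are more involved. The one fact it relies on is that A-to-I editing shows up as A-to-G mismatches in sequencing reads.

```python
from collections import defaultdict


def window_editing_index(sites, window_size=50):
    """Pool A and G read counts per window and return G / (A + G).

    `sites` is an iterable of (position, a_count, g_count) tuples taken at
    reference adenosines. Fixed, non-overlapping windows are a simplifying
    assumption for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])
    for position, a_count, g_count in sites:
        window_start = (position // window_size) * window_size
        totals[window_start][0] += a_count
        totals[window_start][1] += g_count
    return {
        start: g / (a + g)
        for start, (a, g) in totals.items()
        if a + g > 0
    }


def differential_windows(condition_a, condition_b, window_size=50):
    """Return per-window editing index of condition A minus condition B."""
    index_a = window_editing_index(condition_a, window_size)
    index_b = window_editing_index(condition_b, window_size)
    windows = sorted(set(index_a) | set(index_b))
    return {w: index_a.get(w, 0.0) - index_b.get(w, 0.0) for w in windows}


# Toy counts (made up): the edited position differs between conditions
# (120 vs. 130), so a per-site comparison would find no shared site,
# but both positions fall in the window that starts at 100.
wildtype = [(120, 80, 20), (310, 90, 10)]
knockout = [(130, 95, 5), (310, 90, 10)]

diffs = differential_windows(wildtype, knockout)
for window_start, delta in diffs.items():
    print(f"window {window_start}-{window_start + 49}: delta = {delta:+.3f}")
```

A per-site comparison of these toy counts would miss the signal entirely, while pooling by window picks up the difference in the first window and reports no difference in the second.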
My coding this week was mostly data preprocessing and learning to use the compute cluster. At the UCSF Wynton HPC Cluster office hours on Tuesday, I learned how to set up VSCode in the virtual environment, which makes editing and running code easy! I generated genome indices for the lab’s mouse reference genome on UNIX. Essentially, an index is a lookup structure that lets the aligner quickly find where each read belongs, and building it can also take in the genome’s annotation of functional features: coding sequences, noncoding sequences, introns, exons, 5’ UTRs, 3’ UTRs… I am in the middle of preprocessing the lab’s ADAR1 knockout and wildtype mouse brown and white fat cell data (128 gigabytes… it’s taking a while). This entails aligning the raw FASTQ reads to the mouse reference genome to produce BAM files and then sorting those BAM files. Once this code finishes running, I can begin running LoDEI.
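The align-then-sort steps can be sketched as a small Python helper that only assembles the command lines, so they can be reviewed before anything is submitted to the cluster. The post does not say which aligner is used, so STAR here is an assumption, as are the paired-end reads, the sample name, the directory names, and the thread count. The `samtools sort` and `samtools index` calls are the standard ones.

```python
import shlex
from pathlib import PurePosixPath


def build_commands(sample, fastq_r1, fastq_r2, index_dir, out_dir, threads=8):
    """Return the align, sort, and index commands for one sample.

    Assumes STAR as the aligner and paired-end reads; neither is stated
    in the post. Nothing is executed here.
    """
    out = PurePosixPath(out_dir)
    prefix = f"{out / sample}."
    # STAR appends "Aligned.out.bam" to the prefix for unsorted BAM output.
    unsorted_bam = f"{prefix}Aligned.out.bam"
    sorted_bam = str(out / f"{sample}.sorted.bam")

    align = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", str(index_dir),
        "--readFilesIn", str(fastq_r1), str(fastq_r2),
        "--outSAMtype", "BAM", "Unsorted",
        "--outFileNamePrefix", prefix,
    ]
    # Coordinate-sort the alignments, then index the sorted BAM.
    sort = ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, unsorted_bam]
    index = ["samtools", "index", sorted_bam]
    return [align, sort, index]


# Hypothetical sample and file names, for illustration only.
commands = build_commands(
    sample="wt_brown_1",
    fastq_r1="raw/wt_brown_1_R1.fastq",
    fastq_r2="raw/wt_brown_1_R2.fastq",
    index_dir="mouse_index",
    out_dir="aligned",
)
for command in commands:
    print(shlex.join(command))
```

Printing the commands first is a deliberate choice: with 128 gigabytes of input, it is cheaper to catch a wrong path in a dry run than partway through a long job. Each list could later be passed to `subprocess.run(command, check=True)` inside a cluster job script.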
My PI also introduced me to a new project the lab is doing: understanding significant protein translocations between the nucleus and cytoplasm in response to environmental and chemical stimuli. This connects to my research on ADAR1, as the cellular localization of ADAR1 isoforms is tightly regulated: the isoform I am studying, ADAR1p150, has both a nuclear localization sequence (NLS) and a nuclear export sequence (NES). So, this is another project on my radar. I will work to identify proteins with both an NLS and an NES from a Nature mass spectrometry databank, since this could offer insight into ADAR1’s cellular localization and activity. For now, this project is on the back burner as I work on the ADAR1 RNA-editing sites project, but I’ve started reading mass spectrometry papers to gain background knowledge on the topic.
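As a rough idea of what screening proteins for an NLS and an NES could look like, here is a minimal Python sketch that scans amino acid sequences with regular expressions. The two patterns are simplified textbook-style consensus motifs (a short basic stretch for a classical monopartite NLS, and spaced hydrophobic residues for a leucine-rich NES), chosen as assumptions for illustration. They would produce many false positives on real data, they are not ADAR1's actual signals, and the protein names and sequences below are made up.

```python
import re

# Simplified consensus patterns (assumptions for illustration only).
# NLS: K, then K or R, then any residue, then K or R.
NLS_PATTERN = re.compile(r"K[KR].[KR]")
# NES: four hydrophobic residues (L, I, V, F, M) with 2-3, 2-3, and 1
# residues of spacing between them.
NES_PATTERN = re.compile(r"[LIVFM].{2,3}[LIVFM].{2,3}[LIVFM].[LIVFM]")


def find_motifs(sequence, pattern):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


def proteins_with_both(proteins):
    """Return names of proteins with at least one NLS-like and one NES-like hit."""
    return [
        name
        for name, sequence in proteins.items()
        if find_motifs(sequence, NLS_PATTERN) and find_motifs(sequence, NES_PATTERN)
    ]


# Made-up toy sequences, not real proteins.
toy_proteins = {
    "toy_both": "MSPKKKRKVAAGSLALKLAGLDINGG",
    "toy_nls_only": "MSPKKKRKVAAGS",
    "toy_neither": "MAGSTPEDQ",
}

for name, sequence in toy_proteins.items():
    print(name, find_motifs(sequence, NLS_PATTERN), find_motifs(sequence, NES_PATTERN))

print(proteins_with_both(toy_proteins))
```

On a real databank, hits like these would only be candidates to check against the literature or a dedicated prediction tool, since short motifs match by chance in many proteins.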
Until next week!