Week 0: Introduction to My Blog
February 15, 2025
Hi everyone! I’m Sonya, and welcome to my blog post, where I will dive deep into the intricate relationship between computer algorithms, genetic data, and foodborne illnesses. This week, I will focus on why understanding and identifying bacterial species quickly and effectively is necessary and how computer algorithms can help streamline this process.
Background:
An estimated 48 million people fall sick from foodborne illnesses every year in the United States, and the number only grows once we consider countries across the globe. Bacteria and fungi in our food are a constant problem that farmers struggle to manage and people suffer from; whenever one bacterial strain is contained, mutated versions emerge that are adapted to the changed environment. Researchers have therefore begun applying genome sequencing to better track these rapid changes within bacterial strains.
Whole genome sequencing (WGS) is a comprehensive method for analyzing the complete DNA sequence of an organism, including bacteria, and it has been used to track disease outbreaks such as the 2011 E. coli O104:H4 outbreak in Europe. Prior research on WGS of bacterial strains has been applied in two main fields: healthcare and agriculture. Building strain identification databases has also been a focus for many researchers, and one prime example is the NCBI Pathogen Detection database. It integrates bacterial and fungal pathogen genomic sequences from food, water, and other contaminated sources, which helps predict the presence of different pathogens and identify their potential traits.
As a biological researcher, I have worked with gene editing in the past, silencing certain genes within a bacterial strain, and this included working with E. coli cells. This sparked my interest in understanding how mutations form within bacterial species, and it inspired me to create a computer algorithm to better identify these mutations.
Applications of Computer Algorithms:
For my research project, I want to use established global alignment algorithms to compare the widely accepted base sequence of a bacterial strain with mutated versions. This will provide a foundation for developing an algorithm that identifies the presence and location of mutations, which researchers can use to more effectively prevent the spread of the bacterial strain within crops.
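To make the idea concrete, here is a minimal Python sketch of a global alignment (the classic Needleman-Wunsch approach) followed by a pass that reports where the two sequences differ. It is only an illustration under stated assumptions, not the project's final algorithm: the function names `global_align` and `find_mutations`, the scoring values (match 1, mismatch -1, gap -2), and the short example sequences are all made up for this post.

```python
def global_align(ref, query, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (aligned_ref, aligned_query, score). Scoring values are
    illustrative assumptions, not tuned for real genomes.
    """
    n, m = len(ref), len(query)
    # score[i][j] = best score for aligning ref[:i] with query[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align two bases
                score[i - 1][j] + gap,      # gap in query (deletion)
                score[i][j - 1] + gap,      # gap in ref (insertion)
            )

    # Trace back from the bottom-right corner to rebuild one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if ref[i - 1] == query[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                a.append(ref[i - 1])
                b.append(query[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(ref[i - 1])
            b.append("-")
            i -= 1
        else:
            a.append("-")
            b.append(query[j - 1])
            j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), score[n][m]


def find_mutations(aligned_ref, aligned_query):
    """List differences as (ref_position, kind, ref_base, query_base).

    Positions are 1-based in the reference; an insertion is reported at
    the reference position just before it (0 if before the first base).
    """
    mutations = []
    pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            pos += 1
        if r == q:
            continue
        if r == "-":
            mutations.append((pos, "insertion", "-", q))
        elif q == "-":
            mutations.append((pos, "deletion", r, "-"))
        else:
            mutations.append((pos, "substitution", r, q))
    return mutations


if __name__ == "__main__":
    base = "ACGTACGT"     # hypothetical base sequence
    mutated = "ACGTTCGT"  # hypothetical mutated version
    aligned_base, aligned_mut, total = global_align(base, mutated)
    print(find_mutations(aligned_base, aligned_mut))
    # [(5, 'substitution', 'A', 'T')]
```

This version fills a full table of size (length of base) x (length of mutated sequence), so it only suits short sequences. Whole bacterial genomes would need a more memory-efficient method or an existing alignment tool.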
Project Proposal: https://docs.google.com/document/d/1zR22EYsy9k3DMCtXGuB3C8VasVDbNm-JzTwQWVE9nKc/edit?usp=sharing
On-site Internship Website: https://insinstitute.com/
Comments
Hi Sonya!
This is such a cool project—I love how interdisciplinary it is in combining computer algorithms with crop bacteria monitoring. Since it sounds like you’re planning to develop a new mutation-identification algorithm (please correct me if I’m wrong), how are you planning to evaluate its performance/efficiency? Additionally, do similar algorithms exist that you’ll compare it to?
Hi Sophia, thank you so much for your questions! Algorithms like this do exist, as this is a problem that bioinformatics and genetics researchers have been tackling for the last decade or two. However, I wanted to look specifically at how these mutations can be identified and then connected to changes in proteins using a specific alignment algorithm. I am planning to evaluate its performance and efficiency by first using sensitivity, which measures how well the algorithm detects actual mutations, along with other measures such as F1 scores and ROC curves.
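As a rough illustration of the sensitivity and F1 measures mentioned above, here is a small Python sketch that compares a set of predicted mutation positions against a set of known ones. The function name `evaluate_calls` and the example positions are hypothetical, and it assumes a mutation counts as detected only when the predicted position matches a known position exactly. ROC curves are left out because they need a confidence score for each prediction, which this sketch does not model.

```python
def evaluate_calls(true_positions, predicted_positions):
    """Compute sensitivity, precision, and F1 for predicted mutation positions."""
    truth = set(true_positions)
    predicted = set(predicted_positions)

    tp = len(truth & predicted)  # real mutations that were found
    fp = len(predicted - truth)  # reported mutations that are not real
    fn = len(truth - predicted)  # real mutations that were missed

    # Guard against division by zero when a set is empty.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + sensitivity:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f1 = 0.0
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}


if __name__ == "__main__":
    known = [5, 12, 40, 77]  # hypothetical true mutation positions
    called = [5, 12, 50]     # hypothetical algorithm output
    print(evaluate_calls(known, called))
```

With these made-up inputs, two of the four known mutations are found and one of the three calls is wrong, so sensitivity is 0.5, precision is 2/3, and F1 is 4/7.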