Week 1: Introduction
March 4, 2026
Hi everyone. My name is Charan Sridhar, and over the next few months, I’ll be working on a machine learning project focused on nanopore sequencing. I’ll use this blog to document what I’m doing, what I’m learning, and the problems I run into along the way.
Here’s the context. Nanopores are tiny pores that can detect molecules by measuring changes in electrical current as those molecules pass through. The technology was originally developed for DNA sequencing, but researchers have started applying it to other areas like water quality testing. Last summer, I worked with Dr. Cherukumilli at the University of Washington’s SWELL Lab on a project that used nanopores to detect contaminants in water. While I was there, I noticed that the software used to interpret the nanopore signal, called a basecaller, requires a lot of computational power. These models are built to run on high-end GPUs in data centers. That’s a problem if you want to use a nanopore sensor in the field, in a place without reliable internet or powerful hardware.
That’s the problem I’m trying to address. My research question is: how can a basecalling neural network be compressed, without losing meaningful accuracy, so that it can run on low-power devices deployed in the field? I’ll be looking at techniques like model pruning, quantization, and knowledge distillation. Most existing work on basecaller optimization focuses on making basecallers faster on high-end GPUs, not on running them on small CPU-based edge devices.
My hypothesis is that, using techniques like quantization and pruning, I can build a compressed model that is accurate enough to be practically useful in the field for certain applications, such as water quality testing, even if it doesn’t match the performance of a production-level basecaller like Dorado.
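To give a concrete flavor of two of these techniques, here is a minimal sketch of magnitude pruning and symmetric int8 quantization applied to a toy weight matrix. This is purely illustrative: the matrix shape, the 50% pruning ratio, and the random values are my own assumptions, not taken from any real basecaller.

```python
import numpy as np

# Toy weight matrix standing in for one layer of a network
# (shape and values are illustrative, not from a real model).
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8)).astype(np.float32)

# --- Magnitude pruning: zero out the smallest 50% of weights ---
threshold = np.quantile(np.abs(w), 0.5)
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0).astype(np.float32)
sparsity = (w_pruned == 0).mean()

# --- Symmetric int8 quantization: map floats onto [-127, 127] ---
scale = np.abs(w_pruned).max() / 127.0
q = np.clip(np.round(w_pruned / scale), -127, 127).astype(np.int8)

# Dequantize to check how much precision was lost; rounding
# error per weight is bounded by half a quantization step.
w_hat = q.astype(np.float32) * scale
max_err = np.abs(w_pruned - w_hat).max()
```

The payoff is that the pruned weights can be stored sparsely and the quantized ones take 1 byte each instead of 4, at the cost of a small, bounded reconstruction error — the trade-off I’ll be measuring against basecalling accuracy.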
Comments
I found your project very cool. Using computational models to decipher electronic signals piqued my interest, and I’m curious to see where the best improvements can be made.
I’m really interested in what your benchmark for field testing will be, or what factors you’ll be limiting yourself to when trying to test out your models. Will you be running a test on your computer and reading power usage to see the levels, or something different?
Excited to see how your research progresses!
Hi Charan, I can’t wait to see how efficient your model will get. Of the three compression techniques you mentioned, do you plan to apply them independently first to see how much each contributes to efficiency gain and accuracy loss, or will you combine them from the start and tune from there?