Week 2: The Speed Wall
March 5, 2026
Welcome back to my senior project blog! Last week I finished constructing all three models and was eager to jump straight into the training phase. My plan was to start with the QNN first since it is the most complex and unpredictable of the three, get it working well, and then breeze through the two classical CNNs afterward. That plan sounded great on paper. In practice, the quantum model decided to humble me.
When I first attempted to train the QNN on the full 10-digit MNIST classification task, the accuracy was, to put it politely, rough. The model was struggling to meaningfully distinguish between all ten digit classes when each image is compressed down to a 4×4 binarized grid mapped onto 16 qubits. The loss curves were not converging the way I needed them to, and the predictions felt almost random for several of the digit classes. This was not entirely shocking given how much information is lost when you crush a handwritten digit into 16 binary pixels, but it still meant I could not just accept the results and move on. So I started experimenting with the circuit: tweaking the number of entangling layers, adjusting the learning rate, and changing how the gradients produced by the Parameter Shift Rule feed into the optimizer that updates the circuit's trainable parameters.
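For anyone unfamiliar with the Parameter Shift Rule: for rotation-style gates, the exact gradient of an expectation value comes from evaluating the same circuit at two shifted parameter values, θ ± π/2. Here is a toy single-qubit sketch in plain NumPy (a simplified illustration of the rule, not my actual 16-qubit circuit):

```python
import numpy as np

def ry(theta):
    """RY rotation gate as a 2x2 matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

Z = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli-Z observable

def expectation(theta):
    """Prepare RY(theta)|0> and return <psi|Z|psi>.
    Analytically this is cos(theta)."""
    psi = ry(theta) @ np.array([1.0, 0.0])
    return float(psi @ Z @ psi)

def parameter_shift_grad(theta):
    """Parameter Shift Rule:
    df/dtheta = [f(theta + pi/2) - f(theta - pi/2)] / 2
    This is the exact gradient, not a finite-difference approximation."""
    return (expectation(theta + np.pi / 2) - expectation(theta - np.pi / 2)) / 2

theta = 0.7
print(parameter_shift_grad(theta))  # matches the analytic gradient -sin(0.7)
```

The catch, which matters a lot for training speed, is that every gradient requires two extra full circuit evaluations per parameter.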
Here is where the real problem hit. Every single one of those adjustments requires retraining the model from scratch, and each training run on the QNN is agonizingly slow. The fundamental issue is that I am simulating a quantum computer on classical hardware, so every quantum gate operation, every state vector update, and every gradient calculation has to be brute-forced through classical linear algebra. Worse, the Parameter Shift Rule needs two full circuit evaluations per trainable parameter for every gradient step, so the cost multiplies with the number of parameters on top of the simulation overhead. A single epoch takes orders of magnitude longer than it would for either CNN. When I was just building and testing the circuit with small batches during Week 1, the slowness was manageable. Now that I am trying to iterate rapidly on a real training set, each failed experiment costs me hours of waiting. I would tweak a hyperparameter, start a run, watch a progress bar crawl, and then discover the change did not help. That debugging loop I described last week is far more painful when every attempt takes half a day to evaluate.
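To give a feel for why simulation is the bottleneck, here is a back-of-the-envelope sketch in NumPy (a simplified illustration, not my training code): a 16-qubit state vector holds 2¹⁶ complex amplitudes, applying even a single-qubit gate touches every amplitude, and each extra qubit doubles both memory and work.

```python
import numpy as np

n_qubits = 16
dim = 2 ** n_qubits  # 65,536 amplitudes for 16 qubits

# One full state vector of complex amplitudes, initialized to |000...0>.
state = np.zeros(dim, dtype=np.complex128)
state[0] = 1.0
print(f"memory for one 16-qubit state: {state.nbytes / 1024:.0f} KiB")

# Applying a Hadamard to just the first qubit still sweeps the whole vector:
# reshape so that qubit's axis is exposed, multiply, flatten back.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
state = (H @ state.reshape(2, -1)).reshape(-1)

# And the cost doubles per qubit: a 30-qubit simulation would need
# 2**30 amplitudes * 16 bytes each, i.e. 16 GiB for a single state vector.
print(f"30 qubits: ~{(2**30 * 16) / 2**30:.0f} GiB per state vector")
```

Now multiply that per-gate sweep by every gate in the circuit, every parameter-shift evaluation, and every image in the training set, and the hours-long epochs start to make sense.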
Because of this bottleneck, I have not yet trained the two classical CNNs either. There is no point in collecting their baseline numbers until I have a QNN that actually works well enough to make the comparison meaningful, and I cannot get the QNN to that point when every iteration takes this long. The solution I am currently pursuing is procuring access to a faster server. Google Colab, even with GPU acceleration, is not cutting it for the volume of quantum simulation I need. I have been working on getting access to a more powerful machine that can handle the computational overhead more efficiently, compressing those multi-hour runs into something manageable so I can actually iterate at a reasonable pace.
Despite the frustration, this week has been a grounding reminder of why the quantum computing field is pouring billions into building real quantum hardware. Simulating even 16 qubits, evaluated thousands of times over a training set, already pushes classical machines to their limits. Next week, I am hoping to have the new server set up so I can break through this wall and finally get all three models trained on clean data. Stay tuned.
Comments

Hi Patrick. I really enjoyed reading your updates this week and hearing about the challenges you had to navigate with training the models. While I’m not completely familiar with QNNs, a lot of what you described reminded me of the struggles I encountered developing machine learning models. From your post, it sounded like a lot of time went into manual hyperparameter tuning. I was curious whether the computational cost of running a single epoch limited other tuning approaches you might have tried, such as grid search or other automated methods. I’m looking forward to your future blogs!