Week 9: Black magic strikes once again
May 3, 2024
Hello everyone, welcome back to my blog. This week’s blog will be a bit shorter than usual because I have run into yet another roadblock with my code. Let’s get into the abnormalities I’ve observed and how I plan on dealing with them.
Picking up from last week: I had noticed that although the loss converged, the neural network did not produce the right function. I concluded that the trial solution I based my neural network on had the wrong form, so I began tinkering with it, even adding trigonometric terms. None of this worked, even though the loss still converged, and that got me thinking. In theory, if the loss converges, the differential equation and the initial condition should both be satisfied. If they could not be satisfied because the trial solution had the wrong form, the loss should not converge at all. That means my original trial solution was probably the right one after all. So why did it not give the right solution to the system of differential equations?
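To make that reasoning concrete, here is a minimal sketch of the kind of trial-solution setup I am talking about, written in PyTorch with a placeholder network and the toy ODE dy/dt = y (the actual architecture and equations in my code are different). With the initial condition baked into the trial form, the only thing the loss can enforce is the ODE residual at the training points, which is why a converged loss should mean the equation is being satisfied there.

```python
import torch
import torch.nn as nn

# Placeholder network and ODE, standing in for the ones in my code
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
y0 = 1.0  # initial condition y(0) = 1

def trial_solution(t):
    # y_trial(0) = y0 no matter what the network outputs,
    # so the initial condition is satisfied by construction
    return y0 + t * net(t)

def ode_rhs(t, y):
    # toy ODE dy/dt = y, whose solution is exponential growth
    return y

def loss_fn(t):
    t = t.requires_grad_(True)
    y = trial_solution(t)
    dy_dt = torch.autograd.grad(y, t, grad_outputs=torch.ones_like(y),
                                create_graph=True)[0]
    # the loss is just the mean squared ODE residual at the training points
    return torch.mean((dy_dt - ode_rhs(t, y)) ** 2)
```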
I went back to the neural network I built to solve differential equations and extended the range of the input. Big problem. Once I extended the input range from [0, 2] to [0, 10], it suddenly stopped working: instead of the expected exponential growth, it produced a curve that looked more like exponential decay. I tried many ranges, and [0, 2] was the only one that gave the right solution; the others produced strange, almost parabolic curves. This was perplexing, and I still do not completely understand it. Another peculiar observation was that when I trained the model on [0, 2], it was somehow able to predict beyond the range it was trained on. Things are working and breaking in ways that they… should not be. I tried tampering with the ranges on the neural network model I used to solve the Lorenz system to see if I could reproduce the same behavior, but I could not.
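For reference, this is roughly what the range experiment looks like, reusing the placeholder definitions from the sketch above (the interval endpoints, point counts, and training settings are just example values, not the ones in my actual code):

```python
# Train on collocation points in [0, 2], then evaluate the trial solution on [0, 10]
t_train = torch.linspace(0.0, 2.0, 100).unsqueeze(1)   # training range
t_wide = torch.linspace(0.0, 10.0, 500).unsqueeze(1)   # evaluation range

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    optimizer.zero_grad()
    loss = loss_fn(t_train)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    y_pred = trial_solution(t_wide)    # prediction beyond the training range
    y_true = y0 * torch.exp(t_wide)    # exact solution of dy/dt = y, for comparison
```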
This could be a huge issue and could prevent me from finishing the project if I cannot figure out why it happens. However, I have plenty of things to try. Either there is a logical problem in my code, or there is a problem with how these models are being trained. I am first going to assume the latter and train in mini-batches instead of on the entire dataset at once. Then I will go through my code for logical bugs and print out the gradients to figure out what is happening; the model is falling into a trap somewhere and learning incorrectly. If none of that works, I'll have to find another neural network model.
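Continuing the same placeholder sketch, here is roughly what those two debugging steps will look like: random mini-batches of collocation points instead of the full interval, plus printing the loss and per-layer gradient norms every so often to see where training goes wrong.

```python
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(5000):
    # mini-batch: a fresh random batch of collocation points each step
    t_batch = torch.rand(32, 1) * 10.0   # 32 points drawn uniformly from [0, 10]
    optimizer.zero_grad()
    loss = loss_fn(t_batch)
    loss.backward()
    if step % 500 == 0:
        # gradient norms per layer, to spot vanishing or exploding gradients
        grad_norms = {name: p.grad.norm().item()
                      for name, p in net.named_parameters()}
        print(step, loss.item(), grad_norms)
    optimizer.step()
```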
This weekend is going to be a lot of programming. What a difficult problem. Let’s do this.
Thanks for reading my blog and see you guys next week once I finish my final product!