Week 4: Abandoning NeuralODEs
March 22, 2024
Hey folks, welcome back to my humble blog…and this week…we have SUCCESS. I have a lot to share, so let’s get into it.
Because last week was such a struggle, I spent the entirety of Monday studying the basics of deep learning through the textbook Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. The book is a comprehensive guide to deep learning, covering linear algebra, probability, information theory, numerical optimization, and the history and development of AI. Because of this breadth, the textbook is super dense and it took a lot of time to read—but it was definitely worthwhile, as it helped me gain a better understanding of how neural networks are structured. Admittedly, my other motivation for going through the textbook was that I simply wanted to learn more about deep learning.
Tuesday was the start of the trial-and-error process, and just like last week, it was tough. I tried to tinker with a PyTorch Lightning neural network to fit an exponential function, but no matter what hyperparameters I used, the predicted curve always looked like a flat line that suddenly shot upward. Next, I tried to adapt the NeuralODE model I implemented last week to fit a sine function, but the curve the program produced was nothing close to a sine wave. At this point, I suspected that NeuralODEs were not actually helping me accomplish my goals, so I fell back to a plain Lightning model to fit the sine function. This worked only over short ranges and with low accuracy.
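To make that concrete, here is a minimal sketch of the kind of Lightning setup I was fighting with: a small MLP regressing y = exp(x) from generated data. The layer sizes, learning rate, and epoch count here are illustrative guesses, not my exact configuration.

```python
import torch
import pytorch_lightning as pl
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class ExpRegressor(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # one small MLP: 1 input -> hidden layer -> 1 output
        self.net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.net(x), y)  # fit generated (x, y) pairs

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# generated training data: y = exp(x) on [0, 2]
x = torch.linspace(0, 2, 200).unsqueeze(1)
loader = DataLoader(TensorDataset(x, torch.exp(x)), batch_size=32, shuffle=True)

trainer = pl.Trainer(max_epochs=200, logger=False, enable_checkpointing=False)
trainer.fit(ExpRegressor(), loader)
```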
On Wednesday, during my meeting with my external advisor, I finally understood why NeuralODEs were not helping my project. We worked out that a NeuralODE is essentially a generalization of a residual network: a residual block updates its hidden state the same way a numerical ODE solver takes a step, and a NeuralODE takes that idea to the limit by treating the depth of the network as continuous rather than discrete. NeuralODEs are good at modeling data where the correspondence between input and output is an unknown ODE that has to be learned from data, as with weather or fluids. That is not my situation: I am trying to solve a known, given system of ODEs. Though disappointing, this realization pointed me in the right direction.
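Here is a toy illustration of that connection (not project code): one residual block computes exactly the same update as one forward-Euler step of dh/dt = f(h) with step size 1.

```python
import torch
from torch import nn

# some learned dynamics f; the width 4 is arbitrary
f = nn.Sequential(nn.Linear(4, 4), nn.Tanh(), nn.Linear(4, 4))

def residual_block(h):
    return h + f(h)           # ResNet update: h_{t+1} = h_t + f(h_t)

def euler_step(h, dt=1.0):
    return h + dt * f(h)      # forward Euler on dh/dt = f(h)

h = torch.randn(1, 4)
assert torch.allclose(residual_block(h), euler_step(h, dt=1.0))  # identical when dt = 1
```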
After running into many dead ends, I found this lecture: https://compphysics.github.io/MachineLearning/doc/pub/week43/html/week43-bs.html
With this resource, I was able to implement a neural network with one hidden layer that solves the exponential growth and decay equation dy/dx = ky (growth for k &gt; 0, decay for k &lt; 0). This method differs from my past approach because instead of generating training data, it uses a loss function modeled on the given differential equation itself. The loss is not computed between the predicted y and the analytical y, but between the derivative of the predicted y, dy/dx, and what the ODE says that derivative should be. The derivative is obtained by taking the gradient of the network's output with respect to its input. Given how accurate this method turned out to be, I plan to generalize it to a system of ODEs as well.
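Here is a hedged sketch of the approach as I understood it from the lecture. A one-hidden-layer network N(x) defines a trial solution y(x) = y0 + x·N(x), which satisfies y(0) = y0 automatically, and the loss penalizes how far dy/dx is from k·y, with the derivative computed by autograd rather than from data. The values of k and y0, the layer width, and the optimizer settings are my choices for illustration.

```python
import torch
from torch import nn

k, y0 = -1.0, 1.0                        # dy/dx = k*y with y(0) = y0 (decay for k < 0)
net = nn.Sequential(nn.Linear(1, 50), nn.Sigmoid(), nn.Linear(50, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

x = torch.linspace(0, 2, 100).unsqueeze(1).requires_grad_(True)

for step in range(2000):
    y = y0 + x * net(x)                  # trial solution, exact at x = 0
    # dy/dx via autograd; create_graph=True so we can backprop through it
    dydx, = torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)
    loss = ((dydx - k * y) ** 2).mean()  # residual of the ODE itself
    opt.zero_grad()
    loss.backward()
    opt.step()

print(loss.item())  # sanity check: the learned y(x) should approach y0 * exp(k*x)
```

Note that no analytical solution ever enters the training loop: the network is judged purely on whether it obeys the differential equation, which is what should make this generalize to systems of ODEs.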
While there were many dead ends, I was finally able to implement a working program for a simple ODE. I don't expect the rest of this project to be easy, but I now have a clear plan for how I am going to get it done.
That is all the progress I made this week, so see you guys next time!