Week 6: Simple System of ODEs
April 12, 2024
Hey folks, welcome back to my blog. This week was a big mix of frustrations and successes, so let's get into it.
On Monday, I attempted to debug my neural network model, which computes its loss between the estimated solution and the gradient of the neural network, obtained with PyTorch's autograd. As I found out last week, this was difficult because my assumption that PyTorch's autograd and the numpy autograd package work the same way was inaccurate. I created a new Jupyter notebook to try out the autograd features of both libraries on the same functions in order to pin down the difference. It turns out that numpy autograd differentiates Python functions directly, while PyTorch autograd does not: it tracks operations on tensors, so it works through PyTorch functions such as the built-in loss functions and the forward pass of a PyTorch network. However, if I multiply one of these outputs by another variable, the whole thing breaks. For example, say we have a forward function that takes one parameter x, forward(x), and we set a new variable A = x * forward(x). Calling autograd on A raises an error, which is a huge problem because that is exactly the gradient I need in order to calculate the loss at each training step. I will need to find a workaround.
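To keep my notes straight, here's a minimal sketch of the difference as I understand it. This is not my actual notebook code: `torch.tanh` stands in for my network's forward function, and the `grad_outputs` trick at the end is my current guess at a workaround, not something I've verified on the real model yet.

```python
import torch
from autograd import grad          # HIPS autograd (the numpy one)
import autograd.numpy as anp

# numpy autograd differentiates a Python *function*:
f = lambda x: anp.tanh(x) ** 2
df = grad(f)                       # df is itself a callable function
print(df(1.0))

# PyTorch autograd differentiates *tensors* built from tracked
# operations, not the function object itself:
x = torch.tensor(1.0, requires_grad=True)
y = torch.tanh(x) ** 2
(dy,) = torch.autograd.grad(y, x)  # gradient of the scalar y w.r.t. x
print(dy)

# For a non-scalar output like A = x * forward(x), torch.autograd.grad
# needs grad_outputs (the vector for a vector-Jacobian product); calling
# it without one raises an error, which may be exactly what I was hitting:
x = torch.linspace(0.0, 1.0, 5, requires_grad=True)
A = x * torch.tanh(x)              # stand-in for x * forward(x)
(dA,) = torch.autograd.grad(A, x, grad_outputs=torch.ones_like(A))
print(dA)
```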
On Tuesday, I attempted to solve a simple system of ODEs, dy1/dx = y1, dy2/dx = y1 − y2, using the data fitting method. I created a PyTorch dataset that generated estimated y1 and y2 values iteratively, stepping forward in small increments of dx (essentially Euler's method). After checking the generated data against the analytical solutions, I created a neural network with one input, x, and two outputs, y1 and y2, with 5 hidden layers, the ELU activation function, and the MSE loss function. Then I started the training… and watched the loss oscillate from epoch to epoch. That's odd: the loss was not converging. I stopped the training and lowered the learning rate to combat this, but it did not help. I decided to just let it run for a few hundred epochs and check the results. To my surprise, the predictions were fairly accurate even though the MSE loss was in the thousands. Close, but not accurate enough for my liking. My guess is that the loss function has a lot of local minima and the neural network is jumping between them, unable to settle into a good one. Tinkering with the depth and breadth of the neural network did not seem to help. I may need to try something else.
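For reference, here's a rough reconstruction of Tuesday's setup. The layer widths, the Adam optimizer, the step size, and the initial conditions y1(0) = y2(0) = 1 are all my placeholder assumptions, not the exact values from the real run.

```python
import torch
import torch.nn as nn

# Generate training data by stepping the system forward with Euler's method:
#   dy1/dx = y1,  dy2/dx = y1 - y2
dx = 0.01
xs = torch.arange(0.0, 2.0, dx)
ys = torch.empty(len(xs), 2)
ys[0] = torch.tensor([1.0, 1.0])           # assumed initial conditions
for i in range(1, len(xs)):
    y1, y2 = ys[i - 1]
    ys[i, 0] = y1 + dx * y1                # y1' = y1
    ys[i, 1] = y2 + dx * (y1 - y2)         # y2' = y1 - y2

# One input (x), two outputs (y1, y2), 5 hidden layers, ELU, MSE loss.
model = nn.Sequential(
    nn.Linear(1, 32), nn.ELU(),
    nn.Linear(32, 32), nn.ELU(),
    nn.Linear(32, 32), nn.ELU(),
    nn.Linear(32, 32), nn.ELU(),
    nn.Linear(32, 32), nn.ELU(),
    nn.Linear(32, 2),
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(500):
    optimizer.zero_grad()
    pred = model(xs.unsqueeze(1))
    loss = loss_fn(pred, ys)
    loss.backward()
    optimizer.step()
```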
On Wednesday, I got sick of the slow training speed, so I decided to set up some GPU acceleration. I downloaded CUDA and assumed my neural network could automatically use the GPU. But NOPE. Tons of bugs. Man. First, PyTorch was not even able to detect the GPU device. I tried to troubleshoot from the command prompt, but it turned out the command prompt couldn't even find the python command. I learned, painfully, that because I had downloaded Python from python.org instead of the Microsoft Store, it had been added to my PATH incorrectly. Once I finally fixed that, I still had to fix CUDA not being detected by PyTorch, which I did by reinstalling PyTorch with the CUDA build instead of the CPU-only one. GPU acceleration did not make training faster at first, but once I tested it on a bigger neural network, I could definitely see a difference. This makes sense: GPUs excel at parallel computing, so the speedup only shows up when there is enough work to parallelize.
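For anyone hitting the same wall, these are the standard PyTorch checks and the device-moving pattern, not a transcript of my exact session (the toy model and the cu121 install line are just illustrations):

```python
import torch
import torch.nn as nn

# Sanity checks: these printed False / None for me before the fix.
print(torch.cuda.is_available())
print(torch.version.cuda)

# The fix was reinstalling a CUDA build from pytorch.org's selector, e.g.:
#   pip install torch --index-url https://download.pytorch.org/whl/cu121

# After that, moving a model and its data onto the GPU:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(1, 2).to(device)        # toy model just for illustration
x = torch.randn(8, 1, device=device)      # inputs must live on the same device
y = model(x)
print(y.device)
```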
Because training was faster, on Thursday I was able to tinker with the network structure more efficiently. However, hours of trial and error did not get me any better results. I think this is a dead end unless I find a significantly different structure. Frustrated, I decided to make my coding workspace more convenient by switching from Jupyter Lab's web interface to VSCode. I also installed Cygwin to practice Linux-style command-line tools, since that is good coding practice and a standard industry skill for file management and more.
Friday. HAPPY BIRTHDAY TO ME. Yep, today is my birthday. I am finally an adult. Maybe with my adult brain, I can be more successful with this project. Today I tried my data fitting approach on a second-order system of ODEs and realized that it doesn't work, which means either I am missing something or I need to switch to the autograd method. Rather than stay stuck, I am going to try the autograd method.
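To spell out what I mean by "the autograd method": instead of fitting pre-generated data, the loss penalizes how badly the network violates the ODE itself, with autograd supplying dy/dx. Here's a rough, untested sketch for Tuesday's first-order system (the layer sizes are placeholders, and it uses the grad_outputs workaround from Monday):

```python
import torch
import torch.nn as nn

# Toy network: one input x, two outputs (y1, y2).
model = nn.Sequential(nn.Linear(1, 32), nn.ELU(), nn.Linear(32, 2))

x = torch.linspace(0.0, 2.0, 100).unsqueeze(1)
x.requires_grad_(True)

y = model(x)
ones = torch.ones(x.shape[0])

# dy1/dx and dy2/dx via autograd (create_graph=True so the loss itself
# can later be backpropagated through during training).
dy1 = torch.autograd.grad(y[:, 0], x, grad_outputs=ones, create_graph=True)[0]
dy2 = torch.autograd.grad(y[:, 1], x, grad_outputs=ones, create_graph=True)[0]

# Penalize the ODE residuals instead of fitting generated data:
#   dy1/dx - y1   and   dy2/dx - (y1 - y2)
res1 = dy1.squeeze(1) - y[:, 0]
res2 = dy2.squeeze(1) - (y[:, 0] - y[:, 1])
loss = (res1 ** 2).mean() + (res2 ** 2).mean()
# (a real run would also need initial-condition terms in the loss)
```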
That is all the progress I made this week, see you guys next time!