Week 3 - You win some, you lose some
March 15, 2024
Hello everybody and welcome back to Week 3 of my Senior Project blog! This week, I tried to run my ARIMA model using my data from last week and accumulate my “predictions” of stock prices with no external influence. Unfortunately, I could not get my ARIMA model to work and output forecasts, but in this week’s post, I’ll detail the process I went through trying to run this statistical model.
First, I wanted to run the model on Google Colab, and I was able to run part of it (remember the images from last week where I visualized my data). However, when I tried to run my forecasting function, the platform kept crashing and returning the following error message:
In other words, I had two options – either pay for Colab Pro or find another way to run my program.
After doing some research, I found that I could use my laptop’s terminal to run larger Python programs, so that’s what I decided to do. However, I had never used my terminal to run programs, so I didn’t know how to run this model. After installing Python and Anaconda, I was able to run the program from my laptop’s terminal, but many error messages still appeared.
The ARIMA model that I was making was supposed to take about 20 minutes to run. Let me explain how it works. First, the model reads the CSV file and outputs a graph of the visualized data. After it outputs that first graph, however, it waits for the user to close the graph window as a sign to continue. Then, it analyzes the data, creates a forecast of stock prices, and outputs a graph of the forecasts. But guess what, I had no idea that the program WAITS for the user to close the graph.
Wednesday night, when I first ran this program at 1:00 in the morning, after spending the entire day figuring out how to run a Python program on my laptop’s terminal, I was overjoyed to see that first graph, a visualization of my data. “Okay, this is gonna work this time,” I thought. I patiently waited for 20 minutes to pass, expecting to see a second graph with my forecasts, but that never happened. 20 minutes passed, then 40, and suddenly it was past 2:00 in the morning. I decided to go to sleep and check back again the next morning, and to my surprise, the model was still running 10 hours later! How can a model that takes 20 minutes to run still be running after 10 hours!?!?!
At this point, I was feeling a mixture of confusion, stress, and worry, and I could not figure out what was wrong with my code. I even tried running my model on multiple devices, in hopes that one would work. Then, Thursday afternoon, my dad walked into my room, looked at my code, and simply stated, “Oh, you have to close the first graph.”
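For anyone who runs into the same wait, here is a minimal sketch of one way to avoid it. It assumes the graphs are drawn with matplotlib, which the post does not actually confirm; in matplotlib, a plain `plt.show()` call pauses the script until the window is closed. The function name `save_price_plot`, the file name, and the prices below are all made up for illustration and are not taken from the real program.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: draws to files, never opens a window
import matplotlib.pyplot as plt


def save_price_plot(prices, path):
    """Save a line plot of prices to `path` instead of showing it on screen."""
    fig, ax = plt.subplots()
    ax.plot(range(len(prices)), prices)
    ax.set_xlabel("Trading day")
    ax.set_ylabel("Closing price")
    fig.savefig(path)
    plt.close(fig)  # release the figure so nothing is left waiting
    return path


# Hypothetical prices, not real stock data.
example_prices = [101.2, 102.5, 101.9, 103.4, 104.0]
save_price_plot(example_prices, "prices.png")
print("Plot saved, moving on to the forecast step")
```

Because the graph is written to a file rather than shown in a window, the script can move straight on to forecasting, and the image can be opened afterwards.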
Now I had a model that finally ran, but if you thought it would be smooth sailing from there, you’d be wrong. With the model running, I could finally deal with the myriad of errors that arose when I tried predicting stock prices for these companies. In my program, I use a module known as datetime, which handles the dates and times in the data I enter. For example, it can read my data points (which are a list of dates, essentially just numbers) and determine when a new month or year starts. This helps greatly with graphing data, as the module can organize data by its date or time. It was an essential part of my forecasting model because it helped me set the increments at which I wanted to forecast stock prices (daily). Well, when I ran my model, it turned out that the way my program used this module was deprecated, meaning it was no longer supported in the newer version of Python I had installed. From here, I could either reprogram my model with a different module that supports the newer Python versions, or I could figure out how to get access to an older version of Python.
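To make the role of dates a bit more concrete, here is a small sketch using Python's standard-library datetime module. It shows daily forecast steps and a check for the start of a new month. The helper names `next_forecast_dates` and `is_month_start` and the starting date are hypothetical, and the sketch counts every calendar day, even though a real stock dataset would skip weekends and holidays.

```python
from datetime import date, timedelta


def next_forecast_dates(last_date, steps):
    """Return `steps` consecutive calendar dates after `last_date`."""
    return [last_date + timedelta(days=i) for i in range(1, steps + 1)]


def is_month_start(d):
    """True when the date is the first day of a month."""
    return d.day == 1


# Hypothetical last date in the dataset.
last_observed = date(2024, 2, 28)
forecast_dates = next_forecast_dates(last_observed, 3)

for d in forecast_dates:
    label = " (new month)" if is_month_start(d) else ""
    print(d.isoformat() + label)
```

Since 2024 is a leap year, the three dates produced are February 29, March 1, and March 2, with March 1 flagged as the start of a new month.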
I decided to use my sister’s old laptop to run my program. This laptop is very old, and does not have a working battery, so I couldn’t disconnect it from power. And by this point, you know the drill. It had an older version of Python installed, so I downloaded my dataset and program and ran it with that. While the program was able to run, it provided me with a brand new list of errors that looked something like this:
Let me explain this picture. On the left are the dates that I am trying to predict (or at least some of them, as I couldn’t fit everything in one screenshot). On the right is supposed to be a forecasted price, but instead it says NaN, which stands for “Not a Number” and means the value is missing or undefined. That is definitely not supposed to happen.
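As a rough illustration of how NaN values can be spotted in a list of forecasts, here is a sketch in plain Python. The dates, the prices, and the helper name `find_nan_forecasts` are invented for the example and are not the real output of the model.

```python
import math


def find_nan_forecasts(forecasts):
    """Return the dates whose forecasted price is NaN."""
    return [day for day, price in forecasts if math.isnan(price)]


# Hypothetical (date, forecasted price) pairs.
forecasts = [
    ("2024-03-11", 172.5),
    ("2024-03-12", float("nan")),
    ("2024-03-13", float("nan")),
]

missing = find_nan_forecasts(forecasts)
print(f"{len(missing)} of {len(forecasts)} forecasts are NaN: {missing}")
```

The check uses `math.isnan` rather than `==` because NaN is never equal to anything, including itself, so a direct comparison would not find it.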
What exactly does this mean and how do I fix it? I guess you’ll have to wait until next week to find out!