Week 2: Getting the hang of PyTorch and Neural Networks
March 9, 2024
Hey folks, welcome back to my blog!
This week passed by so quickly that it felt like I wrote my last post only yesterday. Anyways, let's get into what I have been working on this week!
I spent the first few days learning linear algebra, from both the MIT Linear Algebra lectures and the book Linear Algebra and Learning from Data. I especially focused on the properties of determinants, eigenvectors, and eigenvalues, which have been fascinating! Determinants have 10 different properties, of which only 3 are needed to derive the rest. These first three properties are:
- The determinant of the identity matrix is 1.
- If you exchange two rows of a matrix, you flip the sign of its determinant.
- If you multiply one row of a matrix by t, the determinant is multiplied by t (more generally, the determinant is linear in each row separately).
Deriving the rest from these gives a very intuitive understanding of what determinants are and how they exploit the symmetries of matrix operations. I won't get into the rigorous details, but I am glad to finally understand the reasoning behind the seemingly random 2 by 2 formula: ad – bc.
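For the curious, here is a quick sketch of how that 2 by 2 formula falls out of just those three properties. The only extra step is that a matrix with two proportional rows has determinant 0: factor the scalars out, and exchanging the two now-equal rows must flip the sign while changing nothing, so the determinant equals its own negative.

```latex
\begin{aligned}
\begin{vmatrix} a & b \\ c & d \end{vmatrix}
&= \begin{vmatrix} a & 0 \\ c & d \end{vmatrix}
 + \begin{vmatrix} 0 & b \\ c & d \end{vmatrix}
&& \text{linearity in row 1} \\
&= \begin{vmatrix} a & 0 \\ c & 0 \end{vmatrix}
 + \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix}
 + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix}
 + \begin{vmatrix} 0 & b \\ 0 & d \end{vmatrix}
&& \text{linearity in row 2} \\
&= 0 + ad\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}
 + bc\begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix} + 0
&& \text{factor out rows; proportional rows give 0} \\
&= ad - bc
&& \text{the row exchange flips } \det I = 1 \text{ to } -1
\end{aligned}
```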
Eigenvalues and eigenvectors…I was first curious about them when my cosmology class touched upon them. They were hard to understand at first, but after following all the lectures, I have a better grasp of them. An eigenvalue and eigenvector are defined by Ax = λx, where λ is the eigenvalue and x is the eigenvector. What this means is that the eigenvector, after being transformed by the matrix, stays parallel to the original vector while being scaled by the eigenvalue. For a given matrix, you can find the eigenvalues by rewriting the above equation as det(A – λI) = 0, where I is the identity matrix, and then solving for the eigenvector belonging to each one. If the matrix is n by n, it has exactly n eigenvalues (counting repeats), and at most n linearly independent eigenvectors. What I found the most interesting is that the sum of the eigenvalues equals the trace of the matrix, the sum of its diagonal entries. A lot of times, eigenvectors are actually really hard to find because eigenvalues can repeat…and when they do, the matrix can end up short an eigenvector. Hopefully I won't encounter that in my project…
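To see this in action, here is a tiny PyTorch check of both facts; the 2 by 2 matrix is just an arbitrary example I made up:

```python
import torch

# An arbitrary example matrix (any square matrix works here)
A = torch.tensor([[2.0, 1.0],
                  [1.0, 2.0]])

# torch.linalg.eig returns complex tensors in general,
# even when the eigenvalues happen to be real
eigenvalues, eigenvectors = torch.linalg.eig(A)

# Each column of `eigenvectors` is an eigenvector x with eigenvalue lam:
# A @ x should equal lam * x
for i in range(A.shape[0]):
    x = eigenvectors[:, i]
    lam = eigenvalues[i]
    print(torch.allclose(A.to(torch.cfloat) @ x, lam * x, atol=1e-6))  # True

# The eigenvalues sum to the trace (the sum of the diagonal entries)
print(eigenvalues.sum().real, torch.trace(A))  # both print tensor(4.)
```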
Speaking of my project, let's go over the direct progress I have made this week! Honestly, it's not a lot, because I ran into a lot of trouble while learning PyTorch and implementing neural networks, but that was expected. I went through the basic tensor concepts, n-dimensional tensors, and the operations provided by the PyTorch library, such as flatten, matmul, stack, and more (there is a small sketch of these below).

After familiarizing myself with those, I read through the implementation of a neural network. I liked it better than TensorFlow because it keeps activation functions separate from the neuron layers, which was easier for me to understand. To put these implementations into practice, I tried to implement a convolutional neural network that can detect handwritten digits (a rough sketch of its structure also follows below). It was a struggle, but in the end I got it to work. I definitely underestimated how long it takes to train a network… Next, I researched which neural network structures could work for ODEs, and decided to try simple RNNs first. I have a lot more research to do and a lot more progress to be made over the next few weeks!
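Here is the kind of thing I mean by those tensor operations, a minimal sketch where the shapes and values are just made up for illustration:

```python
import torch

# Stack three 2x2 matrices into a single 3x2x2 tensor
a = torch.ones(2, 2)
b = torch.zeros(2, 2)
c = torch.eye(2)
stacked = torch.stack([a, b, c])
print(stacked.shape)  # torch.Size([3, 2, 2])

# Flatten everything after the first (batch) dimension
flat = torch.flatten(stacked, start_dim=1)
print(flat.shape)  # torch.Size([3, 4])

# Batched matrix multiplication: matmul maps over the leading dimension
product = torch.matmul(stacked, stacked)
print(product.shape)  # torch.Size([3, 2, 2])
```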
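And here is roughly what the structure of a digit-detecting CNN looks like. This is just a sketch, not my exact network (the real code is in the GitHub repo linked below), and it assumes 28x28 grayscale inputs like MNIST. You can also see here what I mean about PyTorch keeping activations (the ReLU layers) separate from the neuron layers:

```python
import torch
from torch import nn

class DigitCNN(nn.Module):
    """A small convolutional network for 28x28 grayscale digit images."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28 -> 16x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 16x14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32x7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                 # -> 32*7*7 = 1568
            nn.Linear(32 * 7 * 7, 128),
            nn.ReLU(),
            nn.Linear(128, 10),                           # 10 digit classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = DigitCNN()
dummy = torch.randn(8, 1, 28, 28)  # a fake batch of 8 images
print(model(dummy).shape)          # torch.Size([8, 10])
```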
That is all I have for you today! If you want to follow my progress, my GitHub is linked here: https://github.com/Fox-King777/Senior-Project.git