Week 7: The Neural Network
Right after my last blog post, I discovered a really helpful tool related to what I was planning to do: a Ruby script that claims to generate Piet files automatically from any given BrainF program.
Also similar to what I wrote about in that blog post, I thought it would be a good idea to shift my focus again. Now that I had a good script to translate between the languages, I could change my plans for creating the AI model. Previously, I wanted to finish a prototype of a rule-based Piet-to-BrainF translator, but at that point it still couldn’t handle loops in Piet (loops are implemented in a way that makes it hard to quantify what exactly is going on). With this new Ruby script, I could instead randomly generate hundreds or thousands of BrainF programs and convert them to Piet. (While there is no guarantee these programs won’t get caught in infinite loops, it’s much easier to generate random files that a BrainF interpreter recognizes as programs than ones the Piet interpreter does.) This would create the dataset an AI model could train on, and with this much data, it might be significantly easier to put together a usable model.
I wrote a program to generate random BrainF and convert it to Piet through this Ruby script. To make training the AI model easier, I added several features to the program. One is “marked” and “unmarked” Piet files: “marked” files have a small line of white pixels above every new equivalent BrainF operation, while “unmarked” files lack these lines. For every generated BrainF program, both a marked and an unmarked Piet file are generated and put into separate folders. My thinking is that this may clue a model into the patterns of how each BrainF character maps to a specific set of pixels generated by the script. Another feature is splitting the random programs into categories based on complexity: one set pulls from only four BrainF characters (. , + -), the next adds two more (< >), and the one after that adds the most complex BrainF feature, loops ([ ]). My thinking here is that the model may behave better if I focus training solely on the first set, since it is the simplest.
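In sketch form, the generator looks something like this (simplified and illustrative, not the actual program; the function names, lengths, and the way loops are kept balanced are all placeholders):

```python
import random

# Character tiers, following the complexity categories described above.
TIERS = {
    "simple": list(".,+-"),    # tier 1: I/O and increment/decrement only
    "pointer": list(".,+-<>"), # tier 2: adds pointer movement
    "loops": list(".,+-<>"),   # tier 3: loops are inserted as balanced pairs
}

def random_brainf(length=20, tier="simple", loop_chance=0.15):
    """Generate a random BrainF program from the given complexity tier."""
    chars = TIERS[tier]
    program = []
    for _ in range(length):
        if tier == "loops" and random.random() < loop_chance:
            # Emit a small closed loop at once so brackets stay balanced.
            body = "".join(random.choice(chars) for _ in range(3))
            program.append("[" + body + "]")
        else:
            program.append(random.choice(chars))
    return "".join(program)

print(random_brainf(18, "simple"))  # e.g. something like ",,-.+.++.++,.,-+++"
```

Each generated string would then be handed to the Ruby script to produce the matching marked and unmarked Piet files.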
To create this Neural Network, I chose Keras. After some research, it seemed like a high-level Python library that would still give me a lot of flexibility in designing the network. I set up a Google Colaboratory notebook and started building the model.
I quickly ran into some issues. Keras strongly recommends that image files be a standard size for training; otherwise the computations are much less efficient. (I did some reading on this: neural networks rely heavily on matrix multiplication, which only works between matrices with compatible dimensions (if one matrix is sized M x N, the other must be N x K), so these networks are designed to expect inputs of the same size.) But this Ruby script generates images of differing sizes, so I had to find a way to make every generated image the same size as the largest image in the set. In another NN project, it might be possible to crop the training images instead of expanding them, but because Piet code is encoded across the entire length of the PNG, there is no way to crop without losing parts of the data. So I had to learn how to do this padding by reading more about the NumPy library, which provides the array data structures underpinning many Python libraries, including Keras.
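The dimension rule is easy to see in NumPy itself:

```python
import numpy as np

# Matrix multiplication needs the inner dimensions to agree:
# (M, N) @ (N, K) -> (M, K)
a = np.ones((2, 3))   # M x N
b = np.ones((3, 4))   # N x K
print((a @ b).shape)  # (2, 4)

try:
    np.ones((2, 3)) @ np.ones((2, 4))  # inner dimensions 3 and 2 don't match
except ValueError as e:
    print("shape mismatch:", e)
```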
My padding code extends the current array (representing the data of a PNG image) with black sections filling the difference between the current file and the largest file in the set. Let’s look at an example. Here is the first piece of BrainF text generated: “,,-.+.++.++,.,-+++”
This generates into this image, which is smaller than most of the others generated.
Here are the two pieces of BrainF code that produce the largest Piet images: one is the widest, and one is the tallest. In that order: “><+->.,+.-><>+.>.” and “-<+[-+[--<---]]”
And here are their associated images:
The wider one has a size of 325×9 pixels, while the taller one is 110×12 pixels, so I know all these images will fit in a 325×12 frame. Here is what that first Piet image looks like when padded to this size:
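In sketch form, the padding step looks something like this (simplified from the notebook; the centered placement and the height × width × channels array layout are illustrative assumptions):

```python
import numpy as np

# Target frame taken from the largest images in the set (325 wide, 12 tall).
TARGET_W, TARGET_H = 325, 12

def pad_to_frame(arr, target_h=TARGET_H, target_w=TARGET_W):
    """Surround an image array with black (0) pixels until it fills the frame."""
    h, w = arr.shape[:2]
    pad_h, pad_w = target_h - h, target_w - w
    return np.pad(arr,
                  ((pad_h // 2, pad_h - pad_h // 2),  # top / bottom
                   (pad_w // 2, pad_w - pad_w // 2),  # left / right
                   (0, 0)),                           # leave channels alone
                  mode="constant", constant_values=0)

# A 110 x 9 stand-in image, similar in size to the example above.
small = np.ones((9, 110, 3), dtype=np.uint8)
print(pad_to_frame(small).shape)  # (12, 325, 3)
```

Black is a natural fill color here because it never carries program data in Piet, so the padded region can’t be confused with real codels.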
With this padding in place, I now have evenly sized images, so the Neural Network will have an easier time training. But now I need to figure out what layers I will use within the model. Traditionally, image-based NNs use convolutional and pooling layers: convolutional layers scan the image for individual features, and pooling layers downsample the result. This is useful for regular classification models; for example, say we want a model that can tell whether an image depicts a dog or not. One useful “feature” in this classification could be something that looks like a tail in the image. A convolutional layer would pick out the pixels that represent a tail, and pooling would condense that signal, to aid in deciding whether the image is of a dog. Many different filters can be combined to detect many different useful features in an image: if an image shows something with a tail, fur, and no whiskers, it’s more likely to be a dog. These layers are also especially useful for such applications because they find features regardless of position, so a tail could be in any part of the image and still be recognized as a tail. While this is useful for general NN models, these layers aren’t very useful to me, because Piet code is dependent on its position in the image.
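As a starting point, a model built only from position-preserving layers might look something like this (a hypothetical sketch, not a final design: the layer sizes, the maximum program length, and the per-character output encoding are all placeholders):

```python
from tensorflow import keras
from tensorflow.keras import layers

MAX_PROGRAM_LEN = 20   # assumed longest BrainF program in the training set
VOCAB = ".,+-<>[]"     # one class per BrainF character, plus one for padding

model = keras.Sequential([
    layers.Input(shape=(12, 325, 3)),   # the padded 325 x 12 RGB frames
    layers.Rescaling(1.0 / 255),        # normalize pixel values to [0, 1]
    layers.Flatten(),                   # keep every pixel's position intact
    layers.Dense(256, activation="relu"),
    layers.Dense(MAX_PROGRAM_LEN * (len(VOCAB) + 1)),
    layers.Reshape((MAX_PROGRAM_LEN, len(VOCAB) + 1)),
    layers.Softmax(),                   # one distribution per character slot
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Flattening instead of pooling keeps the network sensitive to exactly where each pixel sits, which matters for Piet even though it makes the model larger.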
This upcoming week, I hope to work more on the model and reach a point where I can bring together everything I’ve worked on during the project.