Week 5: Extracting Representations And Layer Details From Denoising AutoEncoders
Welcome to Week 5 of my Senior Project Blog! This week, I will cover how I extracted the internal representations and layer weights of the Denoising Autoencoder (DAE) models built in our lab for Chromosome-22.
As discussed in prior weeks, DAEs are unsupervised artificial neural networks that encode data into an efficient latent-space representation using encoder layers and then reconstruct the original data from that latent space using decoder layers. Autoencoders typically have an hourglass shape, with a bottleneck layer that holds the latent representation. The bottleneck prevents the network from simply copying data from the encoder to the decoder, so the network must learn the most efficient representation of the input while disregarding noise and incidental patterns.
As we saw in Week 4, one way to characterize autoencoders is by the dimensions of their hidden/intermediate layers, i.e., the latent space. The bottleneck is implemented by restricting the dimension of the input through the hidden layers of the encoder. In DAEs, introducing noise into the input data further reduces the effective region of the latent space in which representations can lie. A DAE must balance sensitivity to the inputs, so that it can build an accurate reconstruction, against being insensitive enough to avoid memorizing the inputs and overfitting the training data. Autoencoder design therefore poses a trade-off: keep only the variations in the data required to reconstruct the input, and discard redundancies within it.

By penalizing the network according to a loss function, an autoencoder can learn the most important attributes of the input data and how best to reconstruct the original input from the smallest latent encoding. Such networks are called undercomplete autoencoders. An alternative approach, in which the hidden layers are not reduced in size but the loss function penalizes activations within a layer, yields sparse autoencoders. For any given sample, a sparse autoencoder learns an encoding and decoding that activate only a small fraction of the neurons in each layer. Its hidden-layer nodes thus become sensitive to specific attributes of the input data, whereas an undercomplete autoencoder uses all of its nodes for every sample. In this way, a sparse autoencoder limits the network's capacity to memorize the input without limiting its ability to extract features from the data.
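To make the sparsity idea concrete, here is a toy sketch (my own illustration, not the lab's code) of a sparse-autoencoder loss: a reconstruction term plus an L1 penalty on the hidden activations, which pushes most activations toward zero so that only a few neurons fire per sample.

```python
def sparse_ae_loss(x, x_hat, hidden, lam=1e-3):
    """Reconstruction error plus an L1 sparsity penalty.

    x      : original input values
    x_hat  : reconstructed values from the decoder
    hidden : activations of a hidden layer for this sample
    lam    : weight of the sparsity penalty
    """
    # Mean squared reconstruction error
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # L1 penalty: grows when many hidden units are active
    l1 = sum(abs(h) for h in hidden)
    return mse + lam * l1

# A sparse code (few active units) is penalized less than a dense one,
# even when both reconstruct the input perfectly:
x, x_hat = [1.0, 2.0], [1.0, 2.0]
sparse_code = [0.0, 0.9, 0.0, 0.0]
dense_code = [0.5, 0.5, 0.5, 0.5]
print(sparse_ae_loss(x, x_hat, sparse_code) < sparse_ae_loss(x, x_hat, dense_code))  # True
```

In a real DAE the penalty would be added to the training loss so that gradient descent drives most activations toward zero while still minimizing reconstruction error.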
Now that we understand the different autoencoder architectures and the trade-offs involved, I explored the per-tile DAE models for the 256 fragments of Chromosome-22. By extending the lab's print-model Python function, I was able to output the characteristics of each network and determine the sizes of the last encoder layer (which outputs the latent representation) and the last decoder layer (which outputs the reconstructed input). I accomplished this by sorting the keys of the model's state_dict in reverse order, picking the first item in the sorted list, and loading the corresponding tensor holding that layer's weights, as shown below:
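A minimal sketch of this approach, assuming PyTorch-style checkpoints whose state_dict keys sort into network order (the key names below are hypothetical stand-ins, not the lab's actual checkpoint layout):

```python
def get_last_layer_weight(state_dict):
    """Return (key, tensor) for the deepest layer's weights.

    Assumes the state_dict keys sort into network order (e.g. single-digit
    sequential indices), so after a reverse sort the first weight key
    belongs to the last layer.
    """
    # Keep only weight entries, skipping biases and other buffers
    weight_keys = [k for k in state_dict if k.endswith("weight")]
    # Reverse-sort the keys and take the first one: the deepest layer
    last_key = sorted(weight_keys, reverse=True)[0]
    return last_key, state_dict[last_key]

# In practice the state_dict would come from a saved checkpoint, e.g.
#   state_dict = torch.load(checkpoint_path, map_location="cpu")
# Here a stand-in dict illustrates the key structure:
state_dict = {
    "0.weight": [[0.1, 0.2]], "0.bias": [0.0],
    "2.weight": [[0.3, 0.4]], "2.bias": [0.0],
    "4.weight": [[0.5, 0.6]], "4.bias": [0.0],
}
print(get_last_layer_weight(state_dict)[0])  # -> 4.weight
```

Note that a plain lexicographic sort only matches network order while layer indices stay single-digit; deeper models would need a numeric sort of the key indices.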
Furthermore, I created a couple of wrapper functions to get the last encoding and decoding layers of any AE model. This will let me save only the weights of these layers for further analysis, such as dimensionality reduction.
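The wrappers were along these lines. This sketch assumes the encoder and decoder sub-modules register their parameters under "encoder." and "decoder." key prefixes; those prefixes, and the demo key layout, are my assumptions rather than the lab's actual naming.

```python
def _last_weight_with_prefix(state_dict, prefix):
    # Collect weight keys belonging to one sub-module (e.g. "encoder.")
    keys = [k for k in state_dict if k.startswith(prefix) and k.endswith("weight")]
    # Reverse-sort so the sub-module's deepest layer comes first
    return state_dict[sorted(keys, reverse=True)[0]]

def get_last_encoder_layer(state_dict):
    """Weights of the layer that outputs the latent representation."""
    return _last_weight_with_prefix(state_dict, "encoder.")

def get_last_decoder_layer(state_dict):
    """Weights of the layer that outputs the reconstructed input."""
    return _last_weight_with_prefix(state_dict, "decoder.")

# Hypothetical key layout for a two-layer encoder and decoder:
demo = {
    "encoder.0.weight": "enc0", "encoder.2.weight": "enc2",
    "decoder.0.weight": "dec0", "decoder.2.weight": "dec2",
}
print(get_last_encoder_layer(demo))  # -> enc2
print(get_last_decoder_layer(demo))  # -> dec2
```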
The output for a few sample tiles is shown below:
Interestingly, most of the 256 tile models did not have a bottleneck layer or hourglass architecture; they were most likely sparse autoencoders, since the number of nodes in the hidden layers of the encoder and decoder segments did not change.
Now that I have learned how to get the layer weights for any DAE model, I will next focus on whether more compressed representations are possible using dimensionality-reduction techniques.
Thank you for reading, and see you next week!
- Dias, R., Evans, D., Chen, S.-F., Chen, K.-Y., Loguercio, S., Chan, L., & Torkamani, A. (2022). "Rapid, reference-free human genotype imputation with denoising autoencoders." eLife, 11. https://doi.org/10.7554/elife.75600
- TorkamaniLab. "Imputation_Autoencoder/test/example.vcf." GitHub, 25 May 2022, https://github.com/TorkamaniLab/Imputation_Autoencoder/blob/master/test/example.vcf