Week 4 – Audio Alteration
March 31, 2023
Overview
This week, I worked out ways to build on my colorization model and looked into methods for my audio alteration model. However, since I am still searching for adequate libraries for audio alteration, I’ll keep this week’s post short and focus only on the fundamentals of my approach.
Audio Alteration
This model is built on a field called signal processing, which lets us visualize data that one usually cannot observe directly. For sound, one can represent the audio in a file as a time series of amplitudes, which can then be broken down into its component frequencies.
Within signal processing, there are essentially two ways to represent data: the time domain and the frequency domain. Time domain graphs are what we are used to. They plot some variable against the independent variable of time. The graph below shows the fluctuation of a sound’s amplitude over time. Simple enough to understand.
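To make the time domain concrete, here is a minimal sketch of what the data behind such a graph looks like: a list of amplitude values, one per time step. The 8000 Hz sample rate, the 440 Hz tone, and the helper name sine_wave are assumptions chosen for illustration, not part of my model.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for illustration)

def sine_wave(freq_hz, duration_s, amplitude=1.0, sample_rate=SAMPLE_RATE):
    """Return (times, samples) for a pure tone in the time domain."""
    n = round(duration_s * sample_rate)
    times = [i / sample_rate for i in range(n)]
    samples = [amplitude * math.sin(2 * math.pi * freq_hz * t) for t in times]
    return times, samples

# 10 ms of a 440 Hz tone: each sample is the amplitude at one instant
times, samples = sine_wave(440.0, 0.01)
```

Plotting samples against times would give the kind of amplitude-over-time graph described above.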
The frequency domain, however, is a bit more complicated. Rather than time, the independent variable is frequency. The graph plots how much of the signal lies within each frequency band across a range of frequencies. One can transform time domain graphs into frequency domain graphs through something called the Fourier transform. The graph below shows the graph above transformed into the frequency domain.
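As a rough sketch of this transformation, the snippet below uses NumPy's FFT routines, which are one possible option and not necessarily the library I will end up using. The test signal (a 440 Hz tone plus a quieter 1000 Hz tone) and the sample rate are assumptions made up for the example.

```python
import numpy as np

sample_rate = 8000                        # samples per second (assumed)
t = np.arange(sample_rate) / sample_rate  # one second of time stamps

# Time domain: two pure tones added together
signal = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# Frequency domain: FFT for real-valued input
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)

# Scale so each peak height matches the amplitude of its tone
magnitude = np.abs(spectrum) * 2 / len(signal)

# The two strongest frequency bins should be the two tones we mixed in
top_two = sorted(freqs[np.argsort(magnitude)[-2:]])
```

Plotting magnitude against freqs gives a frequency domain graph with two spikes, one for each tone in the mix.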
These transforms prove vital for audio alteration. Given a time series of a sound signal, one can use the Fourier transform to break the signal down into the frequencies that make it up. This makes it possible to analyze the different sounds and voices within a signal, at least to the extent that they occupy different frequency ranges. By separating these sounds, we gain much more control over the signal and can eliminate certain frequencies to clean the audio up.
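The idea of eliminating certain frequencies can be sketched as follows, again assuming NumPy as a stand-in library. The function name remove_band, the 220 Hz "voice", and the 3000 Hz "hiss" are hypothetical examples; real recordings are much messier, and zeroing bins this abruptly can introduce audible artifacts.

```python
import numpy as np

def remove_band(signal, sample_rate, low_hz, high_hz):
    """Zero out every frequency between low_hz and high_hz (inclusive)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    spectrum[(freqs >= low_hz) & (freqs <= high_hz)] = 0
    # Transform back to the time domain
    return np.fft.irfft(spectrum, n=len(signal))

sample_rate = 8000                        # samples per second (assumed)
t = np.arange(sample_rate) / sample_rate

voice = np.sin(2 * np.pi * 220 * t)        # the sound we want to keep
hiss = 0.3 * np.sin(2 * np.pi * 3000 * t)  # unwanted high-pitched tone

cleaned = remove_band(voice + hiss, sample_rate, 2500, 3500)
```

In this idealized case, cleaned matches voice almost exactly, because the unwanted tone sits entirely inside the band that was removed.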
Next Steps
Next week, I will attempt to train my colorization model on more training images in hopes of better performance. As for my audio alteration, I will continue searching for libraries to carry out Fourier transforms and begin working on code that I can hopefully share within a week.