Week 8 – Combined Model
This week, I combined my graphical colorization model with my audio alteration model and tested the final product with some sample videos.
For my final model, I split a sample video into an audio file and a video file (.wav and .avi) and fed each to its respective model. After each model produced a refined version of its file, I joined the two back together into the final video. The sample video I used was FDR's speech in response to the Pearl Harbor attack in 1941. The original is not only in black and white but also has background static in the audio that persists throughout the whole video. The video attached below is the original video.
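The split-and-merge step can be sketched with ffmpeg. This is a minimal sketch, not my exact pipeline: the file names are placeholders, and the two models would be invoked on the intermediate files between the split and the merge.

```python
import subprocess

def split_av(input_path, wav_out, avi_out):
    """Build ffmpeg commands that separate a video into its
    audio track (.wav) and its silent video track (.avi)."""
    extract_audio = [
        "ffmpeg", "-y", "-i", input_path,
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # uncompressed WAV audio
        wav_out,
    ]
    extract_video = [
        "ffmpeg", "-y", "-i", input_path,
        "-an",          # drop the audio stream
        "-c:v", "copy", # keep the video stream as-is
        avi_out,
    ]
    return extract_audio, extract_video

def merge_av(avi_in, wav_in, output_path):
    """Build the ffmpeg command that remuxes the refined video
    and refined audio back into a single output file."""
    return [
        "ffmpeg", "-y", "-i", avi_in, "-i", wav_in,
        "-c:v", "copy", "-c:a", "aac",
        "-map", "0:v:0",  # video from the first input
        "-map", "1:a:0",  # audio from the second input
        output_path,
    ]

# Each command list can be executed with subprocess.run(cmd, check=True).
```

The functions only build the command lists, so the split/merge logic can be inspected or tested without ffmpeg installed; running them is a one-liner with `subprocess.run`.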
The video attached below is the refined video. I had to cut much of it so that it could be uploaded to this website, but I preserved the key features I want you to notice.
As you can tell, the new video is still far from perfect. My colorization model performed quite poorly, so next week I will look for a library that does a better job of video colorization and integrate it into my final model. As for sound, my model successfully clarified the audio when one person was speaking, but it performed poorly on the audience clapping, which sounded too similar to actual noise. Thus, I will try to find a method by which the model ignores sounds that are too difficult to refine and focuses only on sounds with one or two clear speakers, in an attempt to maintain overall clarity.
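One way this gating idea could work is a simple heuristic sketch (not my implementation): score each audio frame by spectral flatness, which is near 1.0 for noise-like sound such as static or applause and near 0.0 for tonal sound such as a single voice, and only run the refinement model on speech-like frames. The `refine` callback and the threshold value are hypothetical placeholders.

```python
import numpy as np

def spectral_flatness(frame, eps=1e-10):
    """Geometric mean / arithmetic mean of the power spectrum.
    Close to 1.0 for noise-like frames (static, applause),
    close to 0.0 for tonal frames such as a single speaker."""
    power = np.abs(np.fft.rfft(frame)) ** 2 + eps
    geometric = np.exp(np.mean(np.log(power)))
    arithmetic = np.mean(power)
    return geometric / arithmetic

def refine_selectively(audio, frame_len, refine, threshold=0.3):
    """Run `refine` only on frames that look speech-like;
    pass noise-like frames through untouched."""
    out = np.copy(audio)
    for start in range(0, len(audio) - frame_len + 1, frame_len):
        frame = audio[start:start + frame_len]
        if spectral_flatness(frame) < threshold:
            out[start:start + frame_len] = refine(frame)
    return out
```

The appeal of this design is that the model never touches the frames it is likely to mangle, so applause and crowd noise come through unaltered instead of being distorted by a refinement step they were never suited for.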