Week 1: Deepfake Meteorites
February 29, 2024
Hello fellow inhabitants of this pearly blue planet! If you’ve already forgotten what my project is, thanks to its extremely wordy and unmemorable title and description, I’m essentially using machine learning to find asteroids that might collide with Earth.
To detect near-Earth objects (NEOs) with artificial intelligence, we need large amounts of training data to teach the algorithm to accurately recognize which images depict asteroids and which do not. For this project, I’m using data downloaded from the Zwicky Transient Facility (ZTF), a sky survey run by Caltech at Palomar Observatory. ZTF data is quite special in two ways: it has a relatively short exposure time of 15-30 seconds, and it surveys the entire night sky every two days. This makes it well suited to identifying the same asteroid across multiple nights of data and extrapolating a more accurate orbit from the detections.
Oh, and that brings up another topic: how will I determine whether an asteroid will collide with Earth after I detect it in telescope exposures? Using a program called find_orb, I simply plug in the positions of the observations, and the program attempts to fit orbit parameters to the detections (with lower error the more detections I give it). So yes, theoretically I could detect a civilization-ending asteroid and yell at NASA to redirect it and save the world (though that’s highly unlikely).
Anyways, back on topic: training AI to find spooky, scary NEOs. A common tactic in computer vision is to use a large, real, human-labeled training dataset to teach an algorithm what in an image is a bicycle and what’s not, for example. But, unfortunately for us, NEOs are rare and hard to find, meaning that there is only a limited number of real detections to train an algorithm on. In addition, if we were to use previous detections as training data, the algorithm would be biased toward brighter NEOs, as those are easier to spot and identify. It would not be exposed to as many fainter, harder-to-detect NEOs and would ultimately fail to detect these objects in real data.
But, fortunately for us, when NEOs pass by quickly in telescope exposures, they leave a streak that has a nice, predictable spread of brightness, described more quantitatively by a Gaussian point spread function (PSF). Using this already defined distribution, we can simulate streaks, effectively creating deepfakes of asteroids to generate more training data (though technically they’re not generated with deep learning). How smart! This week, I’ve been implementing this streak simulation, and it’s mostly working. I’m now improving the efficiency of the simulation algorithm so that it can generate large training datasets.
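For the curious, here is a rough sketch of the idea in Python with NumPy. It is only an illustration under simple assumptions, not the actual project code: the streak is approximated by adding up many small Gaussian blobs along a straight line, and the function names (simulate_streak, add_noise) and all parameter values are made up for this example. Real exposures would also need a realistic background, detector effects, and a PSF width measured from the image.

```python
import numpy as np

def simulate_streak(shape, start, end, total_flux, sigma, n_samples=200):
    """Approximate a trailed source as a sum of Gaussian PSFs along a segment.

    shape      -- (rows, cols) of the output image
    start, end -- (x, y) pixel coordinates of the streak endpoints
    total_flux -- total counts the streak should contain
    sigma      -- Gaussian PSF width in pixels (assumed constant)
    """
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    image = np.zeros(shape, dtype=float)
    # Step evenly along the trail and drop one Gaussian blob at each step.
    for t in np.linspace(0.0, 1.0, n_samples):
        cx = start[0] + t * (end[0] - start[0])
        cy = start[1] + t * (end[1] - start[1])
        image += np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    # Rescale so the pixels sum to the requested flux.
    image *= total_flux / image.sum()
    return image

def add_noise(image, background, rng):
    """Add a flat sky background and Poisson (photon-counting) noise."""
    return rng.poisson(image + background).astype(float)

# Hypothetical example values: a 64x64 cutout with a faint diagonal streak.
rng = np.random.default_rng(0)
clean = simulate_streak((64, 64), start=(10, 20), end=(50, 40),
                        total_flux=1000.0, sigma=1.5)
noisy = add_noise(clean, background=20.0, rng=rng)
```

Lowering total_flux relative to background gives fainter, harder-to-spot streaks, which is exactly the kind of example that real detections are short on.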
Until next week! Sayonara!