Week 5: More Efficient than EfficientNet?
March 28, 2024
Hello fellow consciousness — have you ever wondered, is it really you reading this blog or is it just that blob of consciousness floating around in the fixed region that is your skull?
Ok, finally I’m gonna reveal the juicy data that proves that my AI model is better than the big, bad (bad meaning good) EfficientNet model invented by big, bad computer scientists (if you have no clue what I’m talking about, take a look at my Week 4 Donkey blog; trust me, it’ll make much more sense).
Anyways.
Boom. Table. Trainable parameters are basically the parameters in an AI model whose values get adjusted during training! If you imagine a machine learning model as a brain, the trainable parameters are essentially the number of neurons (or perhaps inter-neural connections) in the brain. More neurons means smarter, right? Yes…technically. BUT that doesn’t necessarily mean EfficientNet is better just because it has more trainable parameters and is therefore “smarter.”
More trainable parameters also means a clunkier model, or a heavier brain. Imagine you wanted to use a brain to calculate “1+1” (a calculation that honeybees can allegedly do: https://www.cnn.com/2019/02/08/health/honeybees-learn-math-study-trnd). You wouldn’t want to carry around a 3-pound human brain just to calculate “1+1,” would you? You’d much prefer a conveniently sized 2-milligram honeybee brain! Fewer trainable parameters means a more lightweight model that is faster to train and run.
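(For the code-curious: here’s a minimal sketch of how a parameter count like the ones in the table gets tallied, using PyTorch and a completely made-up toy network; the real numbers in the table come from my actual training setup.)

```python
import torch.nn as nn

def count_trainable_params(model: nn.Module) -> int:
    """Add up every element of every parameter that gets updated during training."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# A hypothetical toy network, just to show the idea (not my streak model!)
toy_model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 28 * 28, 1),
)
print(count_trainable_params(toy_model))  # prints 6353 for this toy net
```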
Imagine, for a moment, a heckling computer scientist yelling, “BET YOUR MACHINE LEARNING MODEL IS AS DUMB AS IT IS LIGHTWEIGHT!”
Bam. Graph. On the y-axis is the true positive rate, or TPR (technically, it’s detection completeness on previously confirmed asteroid detections from the Zwicky Transient Facility), and on the x-axis is the false positive rate, or FPR. True positive rate is essentially my model’s ability to recognize real streaks, while false positive rate is how often my model is dumb and thinks an image contains a streak when it doesn’t.
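(If you’d rather see TPR and FPR as code than prose, here’s a little sketch with made-up labels and scores, not my actual ZTF data: each one is just a ratio.)

```python
import numpy as np

# Made-up ground truth (1 = real streak) and model scores, for illustration only
y_true = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.7, 0.4, 0.2, 0.3, 0.8])

threshold = 0.6
y_pred = scores >= threshold

tpr = (y_pred & (y_true == 1)).sum() / (y_true == 1).sum()  # caught streaks / all real streaks
fpr = (y_pred & (y_true == 0)).sum() / (y_true == 0).sum()  # false alarms / all non-streaks
print(tpr, fpr)
```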
You might be wondering, why is it forming a line? Shouldn’t each model just be a point, with a fixed TPR and FPR? And that’s when I applaud you as a reader for paying such close attention and being so astute.
The reason I can plot a line is that the output of the machine learning models is not simply true or false; it’s actually a decimal number from 0 to 1. Now, I can set a “threshold,” let’s say at 0.6, where any output above 0.6 counts as a positive while any output below 0.6 counts as a negative. If I set the threshold at 0, then my model will successfully identify all real streaks, with a TPR of 1, while labeling all images without streaks as having streaks in them, meaning an FPR of 1 as well. If I set the threshold at 1, on the other hand, then my model will never identify a real streak, with a TPR of 0, while labeling all images without streaks correctly, meaning an FPR of 0 as well. Sweeping across a range of thresholds generates a line plot, called the ROC (receiver operating characteristic) curve, and that’s what’s displayed in the graph.
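(Sweeping the threshold like this is exactly what scikit-learn’s roc_curve helper does for you. A sketch, reusing the toy labels and scores from above rather than my real data:)

```python
import numpy as np
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

# Same made-up toy data as before (illustration only, not ZTF detections)
y_true = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.7, 0.4, 0.2, 0.3, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, scores)  # tries every useful cutoff
plt.plot(fpr, tpr)
plt.xlabel("False positive rate (FPR)")
plt.ylabel("True positive rate (TPR)")
plt.title("Toy ROC curve")
plt.show()
```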
An accurate model will maximize TPR while minimizing FPR, and that corresponds to a curve that stretches as close to the top-left corner as possible. So a “higher” curve on the ROC graph means a more accurate model.
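(Quick aside, in case you want “higher” as a single number: the standard summary is the area under the ROC curve, or AUC, where 1.0 is a perfect model and 0.5 is coin-flipping. Not something the graph dwells on, just a handy tool:)

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Same toy data as above, so this number means nothing about my real model
y_true = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.7, 0.4, 0.2, 0.3, 0.8])
print(roc_auc_score(y_true, scores))  # about 0.67 for this toy example
```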
And guess what? You know which model is “higher” and stretches the most to the top-left corner? The blue one. The self-built one. And you know which model doesn’t stretch as far to the top-left corner? EfficientNet. Mic drop.
My model is more lightweight and more accurate than EfficientNet. End of Proof. QED. ATCGW.
Next week, I’ll be talking about — actually, I don’t know what I’m gonna be talking about. Maybe life’s not about figuring everything in the future out. Maybe it’s just about living in the moment, enjoying the pure, refreshing flow of time — until it slips by us, ever so silently.