Week 8: Result Analysis
This week, I graphed and analyzed the results of my model. I did so using the tool Weights & Biases (Wandb), which lets me view the results of a run in real time, as shown in the images below.
Model performance over iterations of training.
The LSTM model was trained for 4,000 iterations, and performance metrics, including the cost function and F1 score, were calculated on both training and validation data. Cost on training and validation data is shown in panels a) and b), respectively. The F1 scores calculated with training data for the true and fake news classes are shown in panels c) and d), and those calculated with validation data are shown in panels e) and f).
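To make the per-class F1 metric concrete, here is a minimal sketch of how an F1 score can be computed separately for the fake and true classes. The labels and predictions below are made-up toy values, and the function name `f1_for_class` is hypothetical; this is not the exact code used in my training loop.

```python
def f1_for_class(y_true, y_pred, positive_label):
    """Compute the F1 score, treating `positive_label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so F1 is reported as 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (assumed data, for illustration only).
y_true = ["fake", "fake", "true", "true", "fake", "true"]
y_pred = ["fake", "true", "true", "true", "fake", "fake"]

f1_fake = f1_for_class(y_true, y_pred, "fake")
f1_true = f1_for_class(y_true, y_pred, "true")
print(f"F1 (fake): {f1_fake:.3f}, F1 (true): {f1_true:.3f}")
```

Computing the score once per class, rather than a single averaged number, is what makes it possible to spot a model that only does well on one of the two classes.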
The RoBERTa model for fake news classification displayed poor performance, with a bias toward one outcome class. For fake news classification, the RoBERTa model failed to generate accurate predictions. After tokenizing each word and mapping it to a number using my own dataset, I tested both the RoBERTa model and the TweetClassifier model. For the RoBERTa model, the output was the same for batch sizes of 16 and 64, and both runs displayed subpar performance. However, when I switched to the TweetClassifier model with a batch size of 256, performance was much better. Thus, RoBERTa was not a good approach in this scenario.
The custom LSTM model achieved testing F1 scores of 78% and 73% for predicting fake and true news, respectively. Compared to the RoBERTa model, the LSTM model had much better performance. As shown in Figures 3, 4, 6, and 7, the F1 score in each graph is around 0.7 to 0.8. In contrast, the RoBERTa model failed to show good results.
My findings suggest that it is possible to use an NLP model to classify news as fake or true based on its title. In this research, the LSTM with a custom word tokenizer gave the best results. The best tokenization approach was creating a vocabulary-to-number mapping with custom tokenization. SentencePiece tokenization, developed by Google, did yield good results. The AG-NEWS PyTorch dataset was used to create a dictionary first, but building the dictionary from the Fake News Dataset gave stronger results. After 4,000 iterations of training, the model performance stabilized.
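As a rough illustration of a vocabulary-to-number mapping, the sketch below builds a dictionary from a handful of titles and encodes a new title as a fixed-length list of integers. The example titles, the regex-based `tokenize` function, the `<pad>` and `<unk>` tokens, and the `max_len` value are all assumptions made for this example, not the exact tokenizer I used.

```python
import re
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # assumed special tokens for padding and unknown words


def tokenize(title):
    """Lowercase a title and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", title.lower())


def build_vocab(titles, min_freq=1):
    """Map each token to an integer ID, most frequent tokens first."""
    counts = Counter(tok for title in titles for tok in tokenize(title))
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending frequency, then alphabetically, so IDs are deterministic.
    for tok, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(title, vocab, max_len=8):
    """Convert a title to a list of IDs, truncated or padded to max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokenize(title)][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Made-up titles, for illustration only.
titles = ["Senate passes new budget bill", "Aliens endorse new budget plan"]
vocab = build_vocab(titles)
encoded = encode("Senate rejects budget", vocab, max_len=5)
print(encoded)
```

Words that never appeared in the titles used to build the dictionary, such as "rejects" here, fall back to the unknown-word ID. This is one plausible reason a dictionary built from the same kind of text as the classification task would work better than one built from a different dataset.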
In the study "Fake News Detection using Bi-directional LSTM-Recurrent Neural Network", the authors found that the RNN model had a vanishing gradient problem that led to weaker results, whereas an LSTM-RNN model had much higher prediction accuracy. My findings are consistent with this: the LSTM model was the strongest approach I tested for predicting fake versus true news.
In conclusion, my research can be applied to articles published online. If used on article titles for news websites, my model may be able to flag possibly misleading or false content, which would then be examined for fake news. If my model is able to identify misinformation accurately, it could help decrease the amount of fake news on the internet.
Next week, I will begin to explore possible applications of the model I have created and their implications, as well as some ways to release it to the public.