Week 0: Introduction

February 2, 2025

Hello! I’m Rajat Rawat, and welcome to my senior project. I will explore how Large Language Models (LLMs) can transform disaster response by bridging language barriers.

“What language barriers?” you may be asking. Imagine living through a hurricane or tornado without understanding the primary language of the area. Clear communication is vital in life-or-death situations, yet many non-English speakers face this challenge during disasters.

Enter my project. I’m evaluating how well two popular LLMs (GPT-4-Turbo-mini and Llama 3.1) answer disaster-related questions in English and Spanish. Specifically, I plan to test whether these models can avoid “hallucinations” (making up false information) by staying grounded in the information they are given and providing accurate guidance during extreme wind events.

Disasters don’t discriminate. Hurricanes, tornadoes, and other extreme events affect diverse populations. However, most US disaster response systems cater to English speakers. By exploring LLMs’ bilingual abilities, I hope to take a major step toward a multilingual chatbot that could help save lives.

My inspiration also stems from the fact that disaster response as a field doesn’t always get enough funding, particularly for innovation. FEMA has run out of funding nine times during disasters since 2001, but projects like this one can spur innovation and help organizations strengthen their response capabilities. My passion for the field began last summer, when I had the opportunity to attend MIT’s Beaver Works Summer Institute and learn how computer science is applied to disaster response.

With guidance from my external advisor, Samuel Scheele, who works as an associate engineer at MIT Lincoln Laboratory’s Humanitarian Assistance and Disaster Relief Systems department, and my internal advisor, Ms. Rangoli, who was my computer science teacher for several years, I’m confident we will bring this project to fruition.

Over the next few months, I’ll:

Build a Dataset: Using CrisisNLP datasets, I’ll gather and filter disaster-related questions from real tweets posted during hurricanes and tornadoes, then translate them into Spanish with DeepL (an AI translation service).
Test the Models: Prompt LLMs with these questions in both English and Spanish, evaluating their accuracy and relevance.
Analyze Results: Score their responses based on whether they avoid hallucinations and provide helpful, actionable advice.
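
To make the dataset-building step a bit more concrete, here is a minimal sketch of how question filtering might look. The keyword list and the function names (DISASTER_TERMS, is_disaster_question, filter_questions) are my own placeholders for illustration, not part of CrisisNLP or the final method, and the real filtering criteria may end up quite different.

```python
import re

# Hypothetical keyword list for illustration only; the actual
# filtering criteria for the project have not been decided here.
DISASTER_TERMS = ("hurricane", "tornado", "storm", "wind", "evacuate", "shelter")


def is_disaster_question(tweet: str) -> bool:
    """Rough heuristic: contains a question mark and a disaster keyword."""
    text = tweet.lower()
    return "?" in text and any(term in text for term in DISASTER_TERMS)


def filter_questions(tweets):
    """Keep disaster-related questions, dropping case-insensitive duplicates."""
    seen = set()
    kept = []
    for tweet in tweets:
        # Collapse repeated whitespace so near-identical tweets compare equal.
        cleaned = re.sub(r"\s+", " ", tweet).strip()
        key = cleaned.lower()
        if is_disaster_question(cleaned) and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return kept
```

A keyword heuristic like this will miss questions without a question mark and let some off-topic tweets through, so manual review would likely still be needed.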

With the help of a native Spanish speaker, I will also spot-check 25% of DeepL’s translations to ensure they are accurate.
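
The 25% spot check could be drawn as a reproducible random sample, roughly like the sketch below. The function name pick_spot_check, the (English, Spanish) pair format, and the fixed seed are assumptions I'm making for illustration; rounding up is also just one possible choice.

```python
import math
import random


def pick_spot_check(pairs, fraction=0.25, seed=0):
    """Return a random subset of translation pairs for human review.

    pairs is assumed to be a list of (english, spanish) tuples.
    The sample size is rounded up so at least one pair is always checked.
    """
    if not pairs:
        return []
    k = max(1, math.ceil(len(pairs) * fraction))
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return rng.sample(pairs, k)
```

Using a fixed seed means the same pairs are selected each time the script runs, which makes it easier to share the review list with the Spanish-speaking reviewer.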

Stay tuned for this exciting adventure!
