Week 10: Final touches
May 9, 2025
Hello and welcome to the final edition of my blog!
This week has been very eventful as I finish up testing in preparation for my final product. I’ve been tailoring the model to specific use cases by simulating end users and their behaviors. One example I’ve been focusing on is a pipeline for practice examples. I hadn’t thought about this earlier: such a conversation takes two steps, so the model needs to remember the previous turn of the conversation, but it doesn’t need the entire history. To handle cases like this, I’ve programmed in a few user-friendly modes so that the user can see some likely use cases right off the bat.
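To make the short-memory idea concrete, here is a minimal sketch of a buffer that keeps only the most recent turn. It is an illustration under my own assumptions, not the project's actual code: the class name RecentTurnMemory, its methods, and the "User:/Assistant:" prompt format are all hypothetical.

```python
from collections import deque


class RecentTurnMemory:
    """Keeps only the most recent turns of a conversation (hypothetical helper)."""

    def __init__(self, max_turns=1):
        # Each turn is a (user_message, assistant_reply) pair.
        # A deque with maxlen silently drops the oldest turn when full.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, user_message, assistant_reply):
        self.turns.append((user_message, assistant_reply))

    def build_prompt(self, new_message):
        # Replay the remembered turns, then append the new user message.
        lines = []
        for user_message, assistant_reply in self.turns:
            lines.append(f"User: {user_message}")
            lines.append(f"Assistant: {assistant_reply}")
        lines.append(f"User: {new_message}")
        return "\n".join(lines)


memory = RecentTurnMemory(max_turns=1)
memory.add_turn("What topics can you cover?", "Any unit in the course.")
memory.add_turn("Give me a practice question.", "Question: (made-up example)")
prompt = memory.build_prompt("My answer is B.")
print(prompt)
```

With max_turns=1, the prompt contains the practice question and the new answer, while the earlier exchange is dropped. That is enough for a two-step flow of "ask a question, then check the answer" without carrying the whole conversation.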
Additionally, I’ve realized that the AP data is too big to be fed in efficiently as one document. The CED is over 540 pages long, and a bit of preprocessing went a long way for the model. I’ve split the CED into chunks based on unit and structure so that the LLM has a better idea of what it’s parsing, which will hopefully help with retrieval. I’ve also been experimenting with different retrieval modes, ranging from LLM-based search to keyword search and even TF-IDF.
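As a rough illustration of the chunking and TF-IDF ideas, here is a small self-contained sketch. It rests on assumptions of mine rather than the real pipeline: it assumes each unit begins with a line starting with "Unit <number>", the sample text is made up, and the function names split_by_unit and tfidf_rank are hypothetical. The scoring is a simple smoothed TF-IDF written from scratch, not any particular library's implementation.

```python
import math
import re
from collections import Counter


def split_by_unit(text):
    """Split text into chunks at lines that start with 'Unit <number>'.

    Assumption: unit headings look like 'Unit 3: ...'. Lines before the
    first heading (front matter) are ignored.
    """
    chunks = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"^Unit (\d+)", line.strip())
        if match:
            current = f"Unit {match.group(1)}"
            chunks[current] = []
        if current is not None:
            chunks[current].append(line.strip())
    return {name: " ".join(lines) for name, lines in chunks.items()}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_rank(query, chunks):
    """Return chunk names ordered by the summed TF-IDF weight of the query terms."""
    tokens = {name: tokenize(body) for name, body in chunks.items()}
    n_chunks = len(tokens)
    # Document frequency: in how many chunks does each term appear?
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))
    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        score = 0.0
        for term in tokenize(query):
            if counts[term]:
                tf = counts[term] / len(words)
                idf = math.log((1 + n_chunks) / (1 + doc_freq[term])) + 1
                score += tf * idf
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)


# Made-up sample text standing in for the real document.
sample = """Unit 1: Limits and Continuity
Limits describe function behavior near a point.
Unit 2: Differentiation
The derivative measures the rate of change."""

chunks = split_by_unit(sample)
print(tfidf_rank("derivative rate", chunks))
```

Splitting first means each query is scored against unit-sized pieces instead of one 540-page blob, so the top-ranked chunk can be handed to the LLM on its own.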
I’ve spent this last week adding these features and fixing bugs as they come up. I’m excited to present at the symposium and show you all what I’ve been working on!