Week 5: Data Implants
March 30, 2024
Hello everyone, and welcome to Week 5 of my senior project. Thank you all so much for following along! Let’s get into this week’s updates: I’ll share my timeline for completing the project and walk through how I found my dataset.
Timeline:
Before starting this project, I created the timeline below of what I hoped to accomplish each week:
Week One- Begin reading Lee’s The AI Revolution in Medicine: GPT-4 and Beyond.
Week Two- Finish reading Lee’s The AI Revolution in Medicine: GPT-4 and Beyond. Begin reading Wilkerson’s The Shift.
Week Three- Finish reading Wilkerson’s The Shift. Begin work with Dr. Kalai to find the EHR dataset.
Week Four- Finish reading all scholarly sources. At this point, with the help of Dr. Kalai, I should be equipped with the knowledge to begin developing my chatbot. Assemble the data into CSV files.
Week Five- Begin creating an outline of the steps needed to create the chatbot. Work with Dr. Kalai to ensure the dataset is accurate.
Week Six- Integrate a custom GPT into Google Colab and add the patient information.
Week Seven- Split the data into training and testing data. Work on training the LLM with the patient history data (p1).
Week Eight- Work on training the LLM with the patient history data (p2).
Week Nine- Use the testing data to ensure the model is accurate.
Week Ten- Work with Dr. Kalai to ensure that the chatbot is providing accurate treatments. Fix any errors that arise.
Week Eleven- Improve the chatbot to make it more user-friendly. If time permits, work on adding an option for patients to schedule an appointment with their dentist.
Week Twelve- Last changes to chatbot + ensure functionality.
Originally, I had planned more time for research, but after reading numerous scholarly articles on creating custom GPTs and on the use of generative AI in medicine, I got a head start on my project. Here is an updated timeline of what I have accomplished so far and what I hope to accomplish over the next several weeks:
Week One- Begin research on clinical and evidence-based decision making.
Week Two- Go over the methodology and steps to create my AI model
Week Three- Begin creating a custom GPT using OpenAI’s GPT-4 tools.
Week Four- Fix the bugs in my custom GPT in order to make a more user-friendly chatbot.
Week Five- Begin finding an open-source EHR dataset.
Week Six- Finish implementing the EHR data.
Week Seven- Integrate a custom GPT into Google Colab. Work on training the LLM with the patient history data (p1).
Week Eight- Work on training the LLM with the patient history data (p2).
Week Nine- Use the testing data to ensure the model is accurate.
Week Ten- Work with Dr. Kalai to ensure that the chatbot is providing accurate treatments. Fix any errors that arise.
Week Eleven- Improve the chatbot to make it more user-friendly. If time permits, work on adding an option for patients to schedule an appointment with their dentist. Create a slideshow for the final presentation.
Week Twelve- Last changes to chatbot + ensure functionality. Finish the slideshow for the final presentation.
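The training/testing split mentioned in the timeline could look roughly like the sketch below. This is only an illustration in plain Python: the function name `train_test_split`, the 80/20 ratio, the fixed seed, and the placeholder patient records are all assumptions, not the actual data or method used in the project.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of records and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical placeholder records, not real patient data.
patients = [{"patient_id": i, "history": f"placeholder note {i}"} for i in range(10)]
train, test = train_test_split(patients)
print(len(train), len(test))  # prints: 8 2
```

Keeping the testing records completely separate from training is what makes the Week Nine accuracy check meaningful, since the model is then evaluated on data it has not seen.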
Acquiring the Dataset:
This week, I focused on acquiring my patient dataset. As mentioned in previous blog posts, I wanted to find an Electronic Health Record (EHR) dataset so that my model stays accurate when it is used in the real world. However, since actual data from medical professionals is difficult to use due to confidentiality, I turned my efforts to finding an open-source EHR.
I came across a dental practice management software on GitHub called DentneD, where I could find data on patient history, appointments, reports, billing information, and other files that may be found in a dentist’s office.
These features are important since they give me access to the information I need (and more) to successfully create my AI model. However, all of the files were written in C#, which meant that I couldn’t import the data directly into Google Colab. I used VS Code instead, but ran into a problem with the licensing agreements: when I executed the files, the output was “Restricted Access”. This is the error that I am getting in the terminal:
[Screenshot: terminal output showing the error]
To solve this issue, I will download Microsoft SQL Server (as highlighted in the screenshot above).
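Once the database is accessible, one possible route into Colab is to export the tables as CSV files (as planned in the original timeline) and read them with Python. The sketch below is only an assumption of how that might look: the column names `patient_id`, `name`, and `last_visit` and the sample rows are made up for illustration and are not the actual DentneD schema.

```python
import csv
import io

# Hypothetical stand-in for an exported CSV file. The column names and rows
# are assumptions for illustration, not the actual DentneD schema or data.
SAMPLE_CSV = """patient_id,name,last_visit
1,Patient A,2024-01-15
2,Patient B,2024-02-03
"""

def load_patients(csv_file):
    """Read patient rows from an open CSV file into a list of dicts."""
    reader = csv.DictReader(csv_file)  # first row is used as the header
    return [dict(row) for row in reader]

patients = load_patients(io.StringIO(SAMPLE_CSV))
print(len(patients), patients[0]["name"])  # prints: 2 Patient A
```

In Colab, the `io.StringIO(SAMPLE_CSV)` stand-in would be replaced by an uploaded file opened with `open(..., newline="")`.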
Conclusion
Thanks for reading this week’s blog post! Next week, I hope to gain access to the open-source database through MSSQL and use this information to continue creating my model. Stay tuned for updates as I implement my EHR data on Colab. Let me know if you have any questions!