Week 2: Here's the Drill
March 8, 2024
Hi everyone, and welcome back! For week two, I will go over my process for creating my virtual dentist chatbot, which I had the chance to start building as part of my capstone project for Advanced Java Topics. However, there is still a long way to go before I finish developing the full model. So, let’s get into it!
Current Progress
As of submitting my capstone final, I have created a chatbot that checks whether a patient is included in the patient records. If so, it then checks for any previous treatments that patient has undergone.
First, I created two lists of sample data, patient information and patient history, using a Google spreadsheet. I am unable to add pictures right now, but essentially, the patient_information dataset contains basic information: patients’ ID numbers, dates of birth, contact numbers, and the dates of their last scheduled appointments. The patient_history dataset includes each patient’s ID number and any past treatments they’ve had (such as root canals, wisdom tooth extractions, and crown treatments). When a patient enters their ID number, the chatbot traverses the patient history list and returns the specific treatments and/or procedures that patient has had in the past.
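As a rough illustration, one row of the patient_information data could be modeled in Java like this. The class name PatientInfoRow, the column order, the ISO date format, and the sample values are all assumptions made for this sketch, not the actual capstone code or data.

```java
import java.time.LocalDate;

// Hypothetical sketch: one row of the patient_information data.
// Field names, column order, and date format are assumptions.
class PatientInfoRow {
    final String id;
    final LocalDate dateOfBirth;
    final String contactNumber;
    final LocalDate lastAppointment;

    PatientInfoRow(String id, LocalDate dateOfBirth,
                   String contactNumber, LocalDate lastAppointment) {
        this.id = id;
        this.dateOfBirth = dateOfBirth;
        this.contactNumber = contactNumber;
        this.lastAppointment = lastAppointment;
    }

    // Assumed column order: id, date of birth, contact number, last appointment.
    // Dates are assumed to be written as yyyy-MM-dd.
    static PatientInfoRow parse(String csvLine) {
        String[] cols = csvLine.split(",");
        if (cols.length != 4) {
            throw new IllegalArgumentException("Expected 4 columns: " + csvLine);
        }
        return new PatientInfoRow(
                cols[0].trim(),
                LocalDate.parse(cols[1].trim()),
                cols[2].trim(),
                LocalDate.parse(cols[3].trim()));
    }

    public static void main(String[] args) {
        // Made-up sample row, not real patient data.
        PatientInfoRow row = parse("1001,1990-04-12,555-0100,2024-02-20");
        System.out.println(row.id + " last seen on " + row.lastAppointment);
    }
}
```

Storing the dates as LocalDate values rather than plain strings would make it easier to compare appointment dates later on.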
To program this chatbot, I created four classes. The first class is patient_info, which reads the patient_info.csv file and then displays this data. The next class is patient_history, which reads the patient_history.csv file and then displays this data. The information displayed includes each patient’s ID number and whether they’ve had dental surgery, crown treatment, wisdom tooth extraction, or root canal treatment. Next is analyze_data, which checks whether the ID number the patient entered is found in the patient records. If the patient is included in the dentist’s records, the chatbot checks whether the patient has had dental surgery, crown treatment, wisdom tooth extraction, or root canal treatment, and returns any treatments it finds. Finally, run_chatbot, the driver class, asks the user to input their ID number. It then calls the analyze_data class, which uses the patient_history.csv file to analyze the patient’s past procedures.
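The lookup step described above might look something like the following. This is only a sketch under stated assumptions: the class name AnalyzeDataSketch, the yes/no column layout, the reply wording, and the sample rows are all hypothetical, and the rows are passed in as strings rather than read from patient_history.csv so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the analyze step; not the actual capstone classes.
class AnalyzeDataSketch {
    // Assumed column order after the ID column.
    static final String[] TREATMENTS = {
        "dental surgery", "crown treatment",
        "wisdom tooth extraction", "root canal treatment"
    };

    // Builds a map of patient ID -> yes/no flags, one flag per treatment.
    static Map<String, boolean[]> loadHistory(List<String> csvRows) {
        Map<String, boolean[]> records = new HashMap<>();
        for (String row : csvRows) {
            String[] cols = row.split(",");
            if (cols.length != TREATMENTS.length + 1) {
                continue; // skip malformed rows
            }
            boolean[] flags = new boolean[TREATMENTS.length];
            for (int i = 0; i < flags.length; i++) {
                flags[i] = cols[i + 1].trim().equalsIgnoreCase("yes");
            }
            records.put(cols[0].trim(), flags);
        }
        return records;
    }

    // Checks whether the ID is on record, then lists any past treatments.
    static String analyze(Map<String, boolean[]> records, String id) {
        boolean[] flags = records.get(id.trim());
        if (flags == null) {
            return "No record found for patient ID " + id.trim() + ".";
        }
        List<String> past = new ArrayList<>();
        for (int i = 0; i < flags.length; i++) {
            if (flags[i]) {
                past.add(TREATMENTS[i]);
            }
        }
        if (past.isEmpty()) {
            return "No past treatments on file.";
        }
        return "Past treatments: " + String.join(", ", past) + ".";
    }

    public static void main(String[] args) {
        // Made-up rows: id, surgery, crown, wisdom tooth, root canal.
        Map<String, boolean[]> records = loadHistory(List.of(
                "1001,no,yes,no,yes",
                "1002,no,no,no,no"));
        System.out.println(analyze(records, "1001"));
        System.out.println(analyze(records, "9999"));
    }
}
```

Keeping the records in a map keyed by ID means each lookup is a single call instead of a loop over every row. A driver class could read the ID from the user and pass it straight to analyze.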
Methodology
This week, I began creating a custom GPT using OpenAI’s GPT-4 by feeding it instructions to focus on dental issues and preventative care. To provide some context, GPT-4 is a large language model, a form of generative AI that is trained on large amounts of text so that it can understand and generate human-like text. In the next week, I hope to finish constructing this custom GPT and set it up in Google Colab, where I will combine it with my current chatbot.
While conducting my research, it may take time to acquire my data set, since large amounts of data are required to train and test my model successfully. When creating an AI model, it is imperative to have large amounts of data to improve generalization. A model trained on only a small amount of data may be unable to adapt to new data, making it inaccurate. It may also struggle with complex data, and once my model is implemented in the real world, it will have to deal with situations it has never seen before.
However, training the model with large amounts of varied data will make the training set more representative, so the model is better able to deal with new scenarios. This matters because we are dealing with patient diagnosis, where the algorithm must be accurate in order to protect patients’ health. To create an ethical chatbot that protects consumers, I will find and integrate an EHR (electronic health record) system to safely access patient records, and then split these records into training and testing data.
I will use patient history, medical records, and the patient’s stated problem as input. With this training set, the model should learn the proper treatments and be able to diagnose more accurately. Finally, to implement my algorithm while continuously monitoring my model, I will use the testing data and work closely with medical professionals to ensure the product is effective and performing well.
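For anyone curious how a training/testing split can work, here is a minimal sketch in Java. The class name TrainTestSplit, the 80/20 ratio, and the fixed seed are assumptions chosen for illustration. The sketch uses plain integers in place of patient records and is not tied to any particular EHR system.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a random train/test split.
class TrainTestSplit {

    // Shuffles a copy of the records and returns [training, testing].
    // The fixed seed makes the shuffle repeatable between runs.
    static <T> List<List<T>> split(List<T> records, double trainFraction, long seed) {
        if (trainFraction < 0.0 || trainFraction > 1.0) {
            throw new IllegalArgumentException("trainFraction must be between 0 and 1");
        }
        List<T> shuffled = new ArrayList<>(records);
        Collections.shuffle(shuffled, new Random(seed));

        int cut = (int) Math.round(shuffled.size() * trainFraction);
        List<T> train = new ArrayList<>(shuffled.subList(0, cut));
        List<T> test = new ArrayList<>(shuffled.subList(cut, shuffled.size()));
        return List.of(train, test);
    }

    public static void main(String[] args) {
        // Stand-in records; real data would be patient records.
        List<Integer> records = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        List<List<Integer>> parts = split(records, 0.8, 42L);
        System.out.println("training size: " + parts.get(0).size());
        System.out.println("testing size: " + parts.get(1).size());
    }
}
```

Shuffling before cutting keeps the testing set from being just the last rows of the file, which matters if the records happen to be sorted by date or ID.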
That’s all for Week 2. See you all next week!