Week 6: Finding a New Floss-ophy
April 6, 2024
Hi everyone, and welcome to Week 6 of my senior project! This week I tried out different options for acquiring my patient dataset. As I mentioned last week, I found a Virtual Dental management software on GitHub but was not able to execute the files because I needed Microsoft SQL Server (MSSQL). I started the week trying to install MSSQL but ran into problems that would prevent me from accessing the patient data. So, let’s dive into the different paths I took!
Option 1: Installing MSSQL
I was facing issues with “Restricted Access” and licensing agreements while accessing the files in Terminal, so I needed to install MSSQL to use the patient management software. To do so, I installed Docker on my laptop, but I realized this could cause problems because everything would be running locally through Terminal. That makes it difficult to connect with Colab, since Colab executes files in the cloud and not on my laptop.
Therefore, I shifted my efforts to installing MSSQL on Amazon Web Services (AWS). This, however, was far too costly, which unfortunately meant that I couldn’t use the dataset from GitHub. While this was disappointing, I was determined to find an alternate way of acquiring my patient data.
Option 2: Using ChatGPT Directly
My next option was to ask the GPT Builder to access specific databases. Although there was no dataset for the model to refer to yet, I wanted to see if it could potentially learn from patient data. First, I configured my GPT to ask the user for their patient ID number before providing assistance. This showed that, when prompted, the model could look up a patient ID in a sample dataset to find that patient’s medical history.
Additionally, this approach requires minimal programming, as the model can access a dataset when prompted. The issue, however, is that OpenAI’s models cannot access EHR data, so this approach cannot be used in professional fields. This led me to my third option.
Option 3: Creating a Sample EHR Dataset
In this option, I will collaborate with Dr. Kalai, my external advisor, to create my own sample dataset and import it into Colab. This not only ensures the accuracy of my model but also allows for more programmatic control, as I can train the model with sample data and not just feed it data to refer to. This option provides access to a wider range of databases since I’m not limited to using only EHR data, enhancing the performance of my model in practical applications. For the GPT Builder to access a dataset, it is necessary to establish a connection between Google Colab and the GPT Builder.
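To give a rough idea of what a sample dataset and a patient ID lookup might look like in a Colab notebook, here is a minimal sketch in Python. It is only an illustration: the column names, the records, and the helper functions load_patients and lookup_history are all made up for this example and are not the actual dataset or code from my project. It also does not cover the connection to the GPT Builder.

```python
import csv
import io

# Hypothetical sample records: every column name and value here is
# made up for illustration and contains no real patient data.
SAMPLE_CSV = """patient_id,name,last_visit,history
1001,Sample Patient A,2024-01-15,Routine cleaning; no cavities
1002,Sample Patient B,2024-02-20,Filling on upper left molar
1003,Sample Patient C,2024-03-05,Gum sensitivity; advised daily flossing
"""


def load_patients(csv_text):
    """Parse CSV text into a dict keyed by patient ID (as a string)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["patient_id"]: row for row in reader}


def lookup_history(patients, patient_id):
    """Return the history for a patient ID, or None if the ID is unknown."""
    record = patients.get(str(patient_id).strip())
    return record["history"] if record else None


patients = load_patients(SAMPLE_CSV)
print(lookup_history(patients, "1002"))  # an ID that is in the sample data
print(lookup_history(patients, "9999"))  # an ID that is not in the sample data
```

In a real notebook, the CSV text would come from an uploaded file rather than a string in the code, but the lookup step would work the same way.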
Conclusion
Next week, I hope to have my sample dataset ready to go and connect the GPT Builder with Colab. Thank you all so much for reading, and follow along to see if Option 3 works for me! Please let me know if you have any questions!