Week 5: Unveiling Insights - A Journey through Data Analysis and Automation
April 1, 2024
A deep analysis of my data is possible with K-means. But before I delve into this week's successes, I must address the issues I encountered. I created two datasets: one with leaks and one without. While removing leaks to build the leak-free dataset, I ran into a problem. I was deleting groups from the same dataset I was iterating through, which would have heavily skewed the numbers in later models. My data removal process was eliminating unnecessary groups but still retaining leaks in some cases. I wasn’t able to identify this error directly until I printed the maximum count within the groups. To fix it, I built a list of leaks during the iteration and removed them only after the pass through the larger dataset had finished.
Furthermore, in earlier iterations of K-means, I wasn’t preprocessing my data. While time values like month, day, year, hour, minute, and second do not need processing, the water meter reading in gallons does. Its scale can significantly affect K-means clustering, which likely obscured a clear understanding of the data’s distribution: the clusters weren’t distinct earlier. I have now applied min-max normalization to the readings, which rescales them so that the largest possible difference between two readings is 1.
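The two fixes described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the dictionary layout, the names `remove_leaks` and `min_max_normalize`, and especially the rule that a group with more than `max_len` readings counts as a leak are all assumptions made for the example.

```python
def remove_leaks(groups, max_len):
    """Return (cleaned_groups, leak_ids) without mutating `groups`.

    `groups` maps a group id to its list of meter readings.
    ASSUMPTION: a group is treated as a leak when it has more than
    `max_len` readings. The real leak rule may be different.
    """
    # Pass 1: only record which groups are leaks. Nothing is deleted
    # here, so the collection being iterated never changes size.
    leak_ids = []
    for group_id, readings in groups.items():
        if len(readings) > max_len:
            leak_ids.append(group_id)

    # Pass 2: remove the recorded leaks after the iteration is over.
    cleaned = dict(groups)
    for group_id in leak_ids:
        del cleaned[group_id]
    return cleaned, leak_ids


def min_max_normalize(values):
    """Rescale values to the range [0, 1] with min-max normalization."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are identical; avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Made-up readings, purely for illustration.
example_groups = {
    1: [0.5, 0.7],
    2: [0.1] * 10,   # long, steady flow: a leak under the assumed rule
    3: [1.2, 1.0, 0.9],
    4: [0.1] * 12,   # also a leak under the assumed rule
}
cleaned, leaks = remove_leaks(example_groups, max_len=5)

# The same sanity check that exposed the original bug: the largest
# group left after cleaning should not exceed the leak threshold.
print(max(len(readings) for readings in cleaned.values()))
print(min_max_normalize([10.0, 20.0, 30.0]))
```

Collecting the ids first and deleting afterwards keeps the loop from skipping elements, which is what happens when a list shrinks underneath its own iterator. With a Python dictionary the same mistake raises a `RuntimeError` instead of failing silently.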
Now, onto the analysis. After printing the values within the K-means clusters, I compared them with my own manually recorded data and noticed two things. First, the groups were accurately formed. Each cluster closely resembled the manually recorded data, which indicates that the automated grouping works without human-made decisions, as an unsupervised learning process should. Second, the clusters lined up with the appliances: wash groups were clustered with wash groups and laundries with laundries, with few outliers. The leaks also stood out clearly.
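One way to make this comparison less manual is to count how often each cluster coincides with each manually recorded label. The sketch below assumes two parallel lists, one of cluster ids and one of manual labels. The function names and the sample values are hypothetical and are not taken from the project's data.

```python
from collections import Counter


def cross_tab(cluster_ids, manual_labels):
    """Count how often each (cluster id, manual label) pair occurs."""
    if len(cluster_ids) != len(manual_labels):
        raise ValueError("cluster_ids and manual_labels must have equal length")
    return Counter(zip(cluster_ids, manual_labels))


def majority_label(cluster_ids, manual_labels):
    """Map each cluster id to the manual label that appears in it most often."""
    per_cluster = {}
    for cluster_id, label in zip(cluster_ids, manual_labels):
        per_cluster.setdefault(cluster_id, Counter())[label] += 1
    return {
        cluster_id: counts.most_common(1)[0][0]
        for cluster_id, counts in per_cluster.items()
    }


# Made-up assignments, purely for illustration.
clusters = [0, 0, 1, 1, 1, 2]
labels = ["wash", "wash", "laundry", "laundry", "wash", "leak"]

for (cluster_id, label), count in sorted(cross_tab(clusters, labels).items()):
    print(cluster_id, label, count)
print(majority_label(clusters, labels))
```

A cluster whose counts are concentrated on one label matches the manual grouping well. Counts spread across several labels point to outliers or to a cluster that mixes appliances.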
The next step in this project is to determine how to automatically assign any new data to its corresponding appliance, eliminating the need for manual analysis.
Comments
Cindy Z. says
What was your data removal process and why was it removing the wrong groups?