Week 1: Creating a Probability Model
March 21, 2024
Hello again, and welcome back to my blog. During the first week of my project, I started designing a probability model to mimic voting patterns in realistic elections. One of the key goals of my project is to study the effects of forced ballot truncation in realistic scenarios, and one of the main ways to do so is to account for the rate at which voters rank different numbers of candidates (i.e., how many voters rank only 1 candidate vs. all of them). The probability model will then come into play when simulating preference profiles, truncating each ballot to varying lengths. Initially, I had planned on obtaining this probability model from previous research on the topic. However, after reading through many journal articles, I came to the conclusion that I would have to craft my own. So, I downloaded all the publicly available data from political elections that used preferential ballots from the data library preflib.org. In total, there were 309 elections to analyze. Then I created a program to aggregate the data into a spreadsheet that tracks the number of candidates and voters in each election. The final field in the sheet stores the number of voters who ranked each number of candidates (e.g., 10 voters ranked 1 candidate, 20 ranked 2, and 100 ranked all 3).
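To make that last field concrete, here is a minimal Python sketch of the kind of tally described above. It is only an illustration, not the actual program used for the project: the function names and the in-memory ballot format (pairs of a voter count and a ranking) are assumptions, and reading the PrefLib files themselves is left out. The second function shows one simple way such counts could be turned into probabilities, namely plain empirical frequencies, which is just one of several possible approaches.

```python
from collections import Counter


def ballot_length_counts(ballots):
    """Return {number of candidates ranked: number of voters}.

    `ballots` is an iterable of (count, ranking) pairs (an assumed format),
    where `ranking` lists candidate ids in preference order and `count`
    is how many voters cast that exact ballot.
    """
    counts = Counter()
    for count, ranking in ballots:
        counts[len(ranking)] += count
    return dict(counts)


def empirical_frequencies(length_counts):
    """Normalize the counts so they sum to 1 (one possible model choice)."""
    total = sum(length_counts.values())
    return {length: n / total for length, n in sorted(length_counts.items())}


# Hypothetical 3-candidate election matching the example in the text.
example_ballots = [
    (10, (1,)),        # 10 voters ranked only candidate 1
    (12, (2, 1)),      # 12 voters ranked two candidates
    (8, (3, 2)),       # 8 more voters ranked two candidates
    (100, (1, 2, 3)),  # 100 voters ranked all three
]

summary = ballot_length_counts(example_ballots)
frequencies = empirical_frequencies(summary)
print(summary)  # {1: 10, 2: 20, 3: 100}
```

Here `summary` reproduces the "10 ranked 1, 20 ranked 2, 100 ranked all 3" row, and `frequencies` gives the share of voters at each ballot length for that single election.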
In the coming weeks, I will use this aggregated data to construct the probability model, which will then be used in the simulation of preference profiles. There are a few different ways to build a probability model from the data, so I will need to consult with my on-site advisor to determine which method to use. That’s all for this week. Thanks for reading!
Comments
Avi L. says
Wow, Jonah, I am quite impressed by your ability to swiftly adapt and then aggregate data from over 300 elections. I have two questions for you. Firstly, how do you plan on making sure that the diversity of these elections represents the large range of voting behaviors? Also, do you think you could elaborate on the different methodologies you are considering in order to develop the probability model you discussed? Overall, very intriguing, and I look forward to learning more about your project in the coming weeks!
Jonah S. says
Thanks for engaging with the blog! To address your first question, the probability model itself is designed to account for the large range of voting behaviors. Because it is based on data from many elections with real voters, the model should mimic realistic voting behavior. I’m still considering how I want to develop the probability model, but I will go over that in next week’s blog. Thanks for the compliments!