Week 9: Playtesting!
April 30, 2024
Hi everyone and welcome back! This week felt exceptionally long, even though I had already completed the coding last week. Somehow, arranging times to meet with people and have them play the game isn’t as easy as I thought it would be. Besides the struggle to find enough people interested in participating (I need a minimum of 60 for statistical significance and I’m currently at 45), I also need to coordinate with everyone simultaneously while taking into account potential last-minute scheduling conflicts.
Experimental Design
In order to study the effect of the AI suggestions on player performance, I split test subjects into two different groups: the “baseline” and the “treatment.” People in the baseline group do not have access to the AI feature, while people in the treatment group have access to and are somewhat encouraged to use the AI feature. Regardless of which group they are in, each participant will play the game five times. This mitigates the effect of luck (frequency of certain types of humanoids) on people’s scores, while also giving players a chance to empirically learn the hidden rules of scoring present in the game in order to maximize their score.
After collecting enough data from both groups, I will run two T-tests comparing players’ scores across the two groups. The first T-test will compare baseline and treatment scores for the first two games. Here, I expect a lower p-value (stronger evidence of a difference in average scores), since participants in the baseline group lack assistance in learning how the game works. However, over time, the baseline participants gain more information by observing the effect of their actions on their final score. This is effectively the same information that the treatment group received at the beginning, so for the second T-test, which will compare scores across all five games, I expect the two groups’ average scores to be more similar.
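Here is a minimal sketch of how these two comparisons could be set up with SciPy’s `ttest_ind`. The data layout (one list of five scores per participant), the helper names `mean_score` and `compare`, and all of the numbers are assumptions made up for illustration, not real results. Averaging each player’s games before testing and using Welch’s variant (`equal_var=False`) are also choices made for this sketch rather than a settled plan.

```python
from statistics import mean

from scipy import stats

# Hypothetical placeholder data: one row per participant, five game scores each.
# These numbers are invented purely to make the sketch runnable.
baseline = [
    [10, 12, 15, 18, 20],
    [8, 11, 14, 17, 19],
    [12, 13, 16, 18, 21],
    [9, 10, 15, 16, 20],
]
treatment = [
    [16, 17, 18, 19, 21],
    [15, 18, 17, 20, 20],
    [17, 16, 19, 18, 22],
    [14, 17, 18, 19, 21],
]


def mean_score(games, n_games):
    """Average each participant's first n_games scores (one value per player)."""
    return [mean(player[:n_games]) for player in games]


def compare(n_games):
    """Welch's two-sample t-test on per-player average scores."""
    return stats.ttest_ind(
        mean_score(baseline, n_games),
        mean_score(treatment, n_games),
        equal_var=False,  # do not assume the two groups share a variance
    )


early = compare(2)    # first two games only
overall = compare(5)  # all five games

print(f"first 2 games: t={early.statistic:.2f}, p={early.pvalue:.4f}")
print(f"all 5 games:   t={overall.statistic:.2f}, p={overall.pvalue:.4f}")
```

Reducing each participant to a single average keeps the observations independent, since five games from the same person are not five independent samples.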
Qualitative Analysis
In addition to the quantitative data that I plan to analyze next week, I am also collecting some qualitative information from the players via a post-experiment survey. After participants in the treatment group complete their last game, I ask them to rate the usefulness of the AI suggestion feature on a scale from 1 to 10. They are then given the opportunity to provide additional comments or feedback regarding the game in general. Interestingly, so far I have found that people who give the AI feature a higher rating are not necessarily those with higher scores among the treatment group. I suspect that this is because people who only use the AI briefly in the beginning to learn the game’s rules do not think it plays a large role in their game experience. However, players who rely more on the AI (but are not necessarily intrinsically better at the game) would likely tend to find the AI more useful.
Potential Problems
My experiment’s reliance on volunteer participants is one of its major flaws: I cannot fully control the players, and they lack a strong incentive to behave rationally. As such, I have already noticed that a few players in the treatment group intentionally avoid reading or following the AI’s suggestions in at least one of their games. This pattern of experimentation seems to be present in the baseline group as well, as some players deliberately choose actions that go against the instructions given to them at the beginning of the game.
Next Steps: Data Analysis + Presentation
Since there’s not much time left after playtesting, I will have to start working on my presentation and setting up the data analysis soon. At the moment, I am planning to use the SciPy package to run my T-tests and any other data analysis I might decide to do. As much as I would like to begin working on that now as a much-needed break from coordinating individual meeting times with many people at once, I’m too busy running playtests to do anything else (today I have 9 scheduled, in addition to people spontaneously setting up meetings when they’re available). Hopefully I’ll be less busy soon…
Thank you so much for reading and see you next week!