Week 3: Mathematical Foundations
March 18, 2024
Hello everyone!
Welcome back to my blog! Today, I will be discussing the mathematical concepts behind my project.
In Game Theory, we often use payoff matrices to define the parameters of two-player games. In a payoff matrix, each cell represents a potential outcome: the first number is the payoff of the first player, and the second is the payoff of the second player. In the prisoner’s dilemma, the payoff matrix looks like this:

                        Player 2 cooperates    Player 2 defects
Player 1 cooperates     (R, R)                 (S, T)
Player 1 defects        (T, S)                 (U, U)

where T > R > U > S.
Traditionally, the Nash Equilibrium, the outcome in which neither player can do better by changing strategy alone, is for both players to defect. We call defecting a dominant strategy, since no matter what the other player chooses to do, defecting gives you a higher payoff than cooperating. Yet you may also notice that both players cooperating would lead to a better outcome for both players. So, we call the outcome of both players defecting Pareto dominated (in other words, worse for both players), and the outcome of both players cooperating Pareto efficient (none of the other outcomes Pareto dominate it).
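To make these definitions concrete, here is a minimal Python sketch that checks them by brute force. The numbers T = 5, R = 3, U = 1, S = 0 are illustrative values I picked (any T > R > U > S would do), and the helper names are my own. The Pareto check uses the usual slightly weaker definition: at least as good for both players and strictly better for one.

```python
from itertools import product

# Illustrative payoffs only (assumed, not part of the project data); any T > R > U > S works.
T, R, U, S = 5, 3, 1, 0
C, D = "C", "D"

# payoffs[(a1, a2)] = (payoff to player 1, payoff to player 2)
payoffs = {
    (C, C): (R, R),
    (C, D): (S, T),
    (D, C): (T, S),
    (D, D): (U, U),
}

def is_pure_nash(a1, a2):
    """True if neither player gains by unilaterally switching actions."""
    u1, u2 = payoffs[(a1, a2)]
    best1 = all(u1 >= payoffs[(alt, a2)][0] for alt in (C, D))
    best2 = all(u2 >= payoffs[(a1, alt)][1] for alt in (C, D))
    return best1 and best2

def is_pareto_dominated(cell):
    """True if another outcome is at least as good for both players and strictly better for one."""
    u1, u2 = payoffs[cell]
    return any(
        v1 >= u1 and v2 >= u2 and (v1 > u1 or v2 > u2)
        for other, (v1, v2) in payoffs.items()
        if other != cell
    )

nash = [cell for cell in product((C, D), repeat=2) if is_pure_nash(*cell)]
dominated = [cell for cell in payoffs if is_pareto_dominated(cell)]
print("Pure Nash equilibria:", nash)
print("Pareto-dominated outcomes:", dominated)
```

With these values, mutual defection should come out as both the only pure Nash Equilibrium and the only Pareto-dominated outcome, which is exactly the tension described above.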
The basis of my project is to try to improve on the Nash Equilibrium and find ways to reach the Pareto efficient outcomes. In my previous blog post, I explained ways in which people have managed to reach this Pareto efficient outcome: through repeated games, through designing programs to perform your strategies, or through using concepts like quantum entanglement. In my project, I will be offering a different solution, which involves allowing the players to modify the payoff matrices to their benefit.
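Imagine, for a moment, that the first player could offer a promise to the second player: “if my payoff is at least T, I will only keep A of it and give you the rest.” The only outcome affected is the one where the first player defects and the second cooperates, so our new payoff matrix would look like this:

                        Player 2 cooperates    Player 2 defects
Player 1 cooperates     (R, R)                 (S, T)
Player 1 defects        (A, S+T-A)             (U, U)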
Assuming that S+T-A > U and A < R, a dominant strategy no longer exists for either player. Thus, the new Nash Equilibrium must be a mixed strategy, where each player chooses to cooperate with a certain probability, and to defect the rest of the time. For this new situation to be productive, the first player must be able to keep more than U, so A > U. Combined with S+T-A > U, this means that S+T must be greater than 2U.
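As a quick sanity check of this claim, the sketch below looks for dominant strategies and pure-strategy equilibria in the modified game. The values T = 6, R = 5, U = 3, S = 1 and A = 3.5 are assumptions chosen so that U < A < R and S+T-A > U, and the function names are hypothetical helpers of mine.

```python
# Illustrative payoffs only (assumed): T > R > U > S, with U < A < R and S + T - A > U.
T, R, U, S = 6, 5, 3, 1
A = 3.5

# Modified game: player 1 keeps A and passes on the rest in the one cell
# where player 1 would have received T (player 1 defects, player 2 cooperates).
payoffs = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (A, S + T - A),
    ("D", "D"): (U, U),
}
ACTIONS = ("C", "D")

def payoff(player, own, other):
    """Payoff to `player` (0 or 1) for playing `own` while the opponent plays `other`."""
    cell = (own, other) if player == 0 else (other, own)
    return payoffs[cell][player]

def dominant_action(player):
    """Return the strictly dominant action for `player`, or None if there is none."""
    for own in ACTIONS:
        alt = "D" if own == "C" else "C"
        if all(payoff(player, own, o) > payoff(player, alt, o) for o in ACTIONS):
            return own
    return None

def pure_nash_equilibria():
    """All cells where neither player gains from a unilateral switch."""
    return [
        (a1, a2)
        for a1 in ACTIONS
        for a2 in ACTIONS
        if all(payoff(0, a1, a2) >= payoff(0, alt, a2) for alt in ACTIONS)
        and all(payoff(1, a2, a1) >= payoff(1, alt, a1) for alt in ACTIONS)
    ]

print("Dominant action, player 1:", dominant_action(0))
print("Dominant action, player 2:", dominant_action(1))
print("Pure Nash equilibria:", pure_nash_equilibria())
```

If neither player has a dominant action and no cell survives the unilateral-switch test, the only equilibrium left is a mixed one.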
The expected payoff for the first player turns out to be \frac{UR-AS}{U+R-A-S}. Taking its derivative with respect to A, we find that the first player’s expected payoff increases as A increases. For this reason, we can split this into two cases:
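For readers who want the intermediate steps, here is one way to derive the first player’s expected payoff, sketched under the assumption that the promise only changes the outcome where the first player defects and the second cooperates. The symbols q (the probability that the second player cooperates) and E_1 (the first player’s expected payoff) are notation I am introducing here. In a mixed equilibrium, q must make the first player indifferent between cooperating and defecting:

```latex
\begin{align*}
qR + (1-q)S &= qA + (1-q)U \\
q\,(R - A) &= (1-q)\,(U - S) \\
q &= \frac{U - S}{U + R - A - S} \\[6pt]
E_1 = qR + (1-q)S &= \frac{R(U - S) + S(R - A)}{U + R - A - S}
                   = \frac{UR - AS}{U + R - A - S} \\[6pt]
\frac{dE_1}{dA} &= \frac{(U - S)(R - S)}{(U + R - A - S)^2} > 0
\end{align*}
```

The derivative is positive because U > S and R > S, so keeping a larger A always raises the first player’s expected payoff in the mixed equilibrium.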
If S+T > R+U, then the first player can choose A > R while still keeping S+T-A > U. Defecting then becomes a dominant strategy for player 1, and player 2’s best response is to cooperate. The expected payoff of player 1 would be A.
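If S+T < R+U, then A must stay below S+T-U, which is itself less than R, so the game remains in the mixed-strategy situation. Since the first player’s expected payoff increases with A, the first player pushes A toward S+T-U. In the limit, the expected payoff is \frac{UR-(S+T-U)S}{2U+R-2S-T}. After some algebraic manipulation, it can be shown that as long as S+T > 2U, the expected payoff is greater than U, resulting in a better situation for both players.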
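The second case can be checked numerically. The sketch below evaluates \frac{UR-AS}{U+R-A-S} with exact fractions for a few values of A; the payoffs T = 6, R = 5, U = 3, S = 1 are again assumed values, chosen so that 2U < S+T < R+U, and `expected_payoff_p1` is a hypothetical helper name.

```python
from fractions import Fraction

def expected_payoff_p1(R, U, S, A):
    """Player 1's expected payoff in the mixed equilibrium: (UR - AS) / (U + R - A - S).

    Assumes R > U > S and A < R, so the denominator is positive.
    """
    return Fraction(U * R - A * S) / Fraction(U + R - A - S)

# Illustrative values (assumed): T > R > U > S, with 2U < S + T < R + U (second case).
T, R, U, S = 6, 5, 3, 1
upper = S + T - U  # A must stay below this bound so that S + T - A > U

candidates = (Fraction(5, 2), Fraction(3), Fraction(7, 2), Fraction(39, 10))
results = []
for A in candidates:
    e1 = expected_payoff_p1(R, U, S, A)
    results.append(e1)
    print(f"A = {float(A):.2f}: expected payoff = {float(e1):.3f}, beats U: {e1 > U}")

# Value the expected payoff approaches as A tends to S + T - U.
limit = expected_payoff_p1(R, U, S, upper)
print(f"Limit as A approaches {upper}: {float(limit):.3f}")
```

In this example the expected payoff only exceeds U once A itself exceeds U, which is why the condition S+T > 2U matters: it is what leaves room for an A between U and S+T-U.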
The math shows that the first player can in effect “sacrifice” some of their utility to bring the outcome of the game closer to being Pareto efficient, resulting in increases in expected payoff for both players.
That’s it for this week’s post. Stay tuned for next week, where I will be diving slightly deeper into the underlying mathematics, as well as talking about the practicality and implications of these ideas. See you then!