# Week 2: Literature Review

March 12, 2024

Hello everyone,

Welcome back to my blog! Today, I will be sharing the literature review I completed this week. For this project, I first looked into the details of cooperative game theory, learning new concepts like the Shapley and Banzhaf values. Then, I looked into approaches other people have used to reach Pareto-efficient outcomes.

The first paper I looked at was this: https://intelligence.org/files/ProgramEquilibrium.pdf. The paper looks at what happens when players submit programs that can read each other’s source code, and it calls a Nash equilibrium of that setting a “program equilibrium.” Some of these equilibria reach the cooperative outcome. For example, a program that cooperates only when the other program’s source code is identical to its own is a basic case of this, although programs that cooperate much more often are possible, as shown in the paper.
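To make the “cooperate if the source code matches” idea concrete, here is a minimal Python sketch. It is my own toy illustration, not the paper’s construction: the names `CLIQUE_BOT`, `DEFECT_BOT`, `run`, and `match` are hypothetical, and I am assuming each program is a string of Python source that defines a `play(my_src, their_src)` function.

```python
# Toy illustration only: programs are source strings that can read each other.
# All names here are hypothetical and do not come from the paper.

CLIQUE_BOT = '''
def play(my_src, their_src):
    # Cooperate only with an exact copy of myself.
    return "C" if their_src == my_src else "D"
'''

DEFECT_BOT = '''
def play(my_src, their_src):
    # Ignore the opponent's source and always defect.
    return "D"
'''

def run(src, their_src):
    """Execute one program, handing it its own source and the opponent's."""
    namespace = {}
    exec(src, namespace)
    return namespace["play"](src, their_src)

def match(src_a, src_b):
    """Return the pair of moves (A's move, B's move)."""
    return run(src_a, src_b), run(src_b, src_a)

if __name__ == "__main__":
    print(match(CLIQUE_BOT, CLIQUE_BOT))
    print(match(CLIQUE_BOT, DEFECT_BOT))
```

Two copies of `CLIQUE_BOT` cooperate with each other, while any program with different source gets defected against, so swapping in a different program cannot exploit it.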

The second concept I investigated was the idea that the equilibrium of a repeated game can be different from the Nash equilibrium of the one-shot game. This concept is very well documented and is closely connected to the field of evolutionary game theory. Here’s the idea: let’s say that a strategy is a piece of code whose input is the other player’s previous moves (cooperate or defect). Then, strategies that cooperate can generate more value over time, because a player who defects will likely keep facing defection, while a player who cooperates is more likely to have cooperative counterparts.
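Here is a small sketch of that setup in Python. The payoff numbers (3, 0, 5, 1) are the usual textbook prisoner’s dilemma values and are an assumption on my part, as are the names `tit_for_tat`, `always_defect`, and `play_repeated`. Each strategy is a function that only sees the other player’s past moves.

```python
# Assumed payoffs: (my move, their move) -> (my payoff, their payoff).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opp_history):
    # Cooperate first, then copy the opponent's last move.
    return "C" if not opp_history else opp_history[-1]

def always_defect(opp_history):
    return "D"

def play_repeated(strat_a, strat_b, rounds=10):
    """Play the game `rounds` times and return the total scores (A, B)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_b)  # A only sees B's past moves
        move_b = strat_b(hist_a)  # B only sees A's past moves
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

if __name__ == "__main__":
    print(play_repeated(tit_for_tat, tit_for_tat))      # (30, 30)
    print(play_repeated(tit_for_tat, always_defect))    # (9, 14)
    print(play_repeated(always_defect, always_defect))  # (10, 10)
```

Over ten rounds, the defector wins its single match against tit-for-tat (14 to 9), but two cooperators earn 30 each, which is far more than a defector gets against anyone who retaliates.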

The third concept I looked into was the idea of using quantum mechanics to model games. There were many papers that investigated this topic, but I particularly focused on https://arxiv.org/pdf/1010.0047.pdf and https://arxiv.org/pdf/quant-ph/9806088.pdf. The premise is as follows. If we model cooperate and defect as the two orthonormal basis vectors of a Hilbert space, H, we can view each player’s strategy as a unitary operator acting on this space. Then, once the players decide on their strategies, an arbitrator applies the tensor product of those operators to a shared entangled state, measures the result, and assigns payoffs to each player. Under this model, always defecting is no longer a Nash equilibrium, and the new equilibrium is Pareto efficient. However, this model has many drawbacks, such as the unrealistic requirement of an arbitrator, and it is largely infeasible to implement in everyday life.
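To check my understanding of the quantum model, I find it helpful to compute a few payoffs by hand. The sketch below is my own pure-Python reading of the scheme, under several assumptions: the payoffs are the standard 3, 0, 5, 1; the entangling operator is J = exp(iγ D⊗D / 2) with γ = π/2; and `Q` is the “quantum” strategy diag(i, -i). The helper names (`kron`, `apply`, `dagger`, `entangler`, `payoffs`) are hypothetical.

```python
import math

# Basis order: CC, CD, DC, DD (first letter is player A's outcome).
PAYOFF_A = [3, 0, 5, 1]  # assumed standard prisoner's dilemma payoffs
PAYOFF_B = [3, 5, 0, 1]

C = [[1, 0], [0, 1]]      # cooperate: identity
D = [[0, 1], [-1, 0]]     # defect: flips C and D
Q = [[1j, 0], [0, -1j]]   # "quantum" strategy with no classical counterpart

def kron(a, b):
    """Tensor product of two 2x2 matrices, as a 4x4 nested list."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)]
            for i in range(4)]

def apply(m, v):
    """Multiply a 4x4 matrix by a length-4 state vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def dagger(m):
    """Conjugate transpose of a 4x4 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(4)] for i in range(4)]

def entangler(gamma):
    """J = cos(gamma/2) I + i sin(gamma/2) D(x)D, valid since (D(x)D)^2 = I."""
    dd = kron(D, D)
    c, s = math.cos(gamma / 2), math.sin(gamma / 2)
    return [[c * (i == j) + 1j * s * dd[i][j] for j in range(4)]
            for i in range(4)]

def payoffs(u_a, u_b, gamma=math.pi / 2):
    """Expected payoffs (A, B) when A plays u_a and B plays u_b."""
    j = entangler(gamma)
    state = apply(j, [1, 0, 0, 0])         # arbitrator entangles |CC>
    state = apply(kron(u_a, u_b), state)   # players apply their strategies
    state = apply(dagger(j), state)        # arbitrator disentangles
    probs = [abs(x) ** 2 for x in state]   # measurement probabilities
    return (sum(p * a for p, a in zip(probs, PAYOFF_A)),
            sum(p * b for p, b in zip(probs, PAYOFF_B)))

if __name__ == "__main__":
    for name, (a, b) in {"D,D": (D, D), "Q,Q": (Q, Q), "D,Q": (D, Q)}.items():
        print(name, [round(x, 6) for x in payoffs(a, b)])
```

With these assumptions, both players choosing `Q` earn 3 each, and a player who switches from `Q` to `D` drops to 0, so defecting against `Q` does not pay. This only tests a single deviation, so it is a sanity check rather than a proof that (`Q`, `Q`) is an equilibrium. Setting `gamma=0` removes the entanglement and gives back the ordinary classical game.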

Next week, I am planning to work through the math behind my ideas and see if I can extend them to games beyond the prisoner’s dilemma. I also want to see how their drawbacks and strengths stack up against some of the other strategies I have researched.

That’s all for today. Make sure to tune in next week to see some of the theoretical mathematical concepts I investigate!
