One round of the game consists of each of two bots simultaneously generating either a 0 or a 1. Then, each bot's score is adjusted as follows.
(player 1's score adjustment, player 2's score adjustment) =

                          Player 1 selection
                            0           1
Player 2 selection    0   (2, 2)     (0, 10)
                      1   (10, 0)    (5, 5)
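The payoff table can be written as a simple lookup. The following is only a minimal sketch in Python; the names PAYOFFS and score_round are hypothetical and not part of the game's specification.

```python
# Payoff lookup keyed by (player 1 selection, player 2 selection).
# Each value is (player 1's adjustment, player 2's adjustment).
# Lower totals are better in this game.
PAYOFFS = {
    (0, 0): (2, 2),
    (1, 0): (0, 10),
    (0, 1): (10, 0),
    (1, 1): (5, 5),
}


def score_round(p1_choice, p2_choice):
    """Return the two score adjustments for one round (hypothetical helper)."""
    return PAYOFFS[(p1_choice, p2_choice)]
```

For example, score_round(1, 0) gives (0, 10): player 1 chose 1 against player 2's 0, so player 1 adds nothing and player 2 adds 10.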
Every bot plays 50 rounds with every other bot. The object of the game is to get as low a total score as possible. If only one round is played, the optimal strategy is to choose 1: whatever the opponent selects, choosing 1 results in a lower score than choosing 0 (0 rather than 2 if the opponent chooses 0, and 5 rather than 10 if the opponent chooses 1), at the opponent's expense. However, since many rounds are played, it is possible for the two bots to develop some sort of trust, such that both choose 0 for mutual gain (2 points each per round rather than 5).
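To illustrate how repeated play changes the picture, here is a rough sketch of a 50-round match in Python. The bot interface (a function that receives both selection histories) and the two example strategies, always_one and tit_for_tat, are assumptions made for illustration only; the game itself does not prescribe them.

```python
ROUNDS = 50

# (player 1 selection, player 2 selection) -> (player 1 adj., player 2 adj.)
PAYOFFS = {
    (0, 0): (2, 2),
    (1, 0): (0, 10),
    (0, 1): (10, 0),
    (1, 1): (5, 5),
}


def always_one(own_history, opponent_history):
    """Hypothetical bot: always selects 1."""
    return 1


def tit_for_tat(own_history, opponent_history):
    """Hypothetical bot: starts with 0, then copies the opponent's last move."""
    return opponent_history[-1] if opponent_history else 0


def play_match(bot1, bot2, rounds=ROUNDS):
    """Play the given number of rounds and return both total scores."""
    history1, history2 = [], []
    total1 = total2 = 0
    for _ in range(rounds):
        # Both bots choose using only past rounds, so the current
        # selections are effectively simultaneous.
        choice1 = bot1(history1, history2)
        choice2 = bot2(history2, history1)
        adj1, adj2 = PAYOFFS[(choice1, choice2)]
        total1 += adj1
        total2 += adj2
        history1.append(choice1)
        history2.append(choice2)
    return total1, total2


if __name__ == "__main__":
    print(play_match(tit_for_tat, tit_for_tat))  # (100, 100)
    print(play_match(always_one, always_one))    # (250, 250)
    print(play_match(always_one, tit_for_tat))   # (245, 255)
```

Under these assumptions, two bots that keep choosing 0 finish with 100 points each, while two bots that always choose 1 finish with 250 each. A bot that always chooses 1 gains only a small edge over a retaliating opponent (245 against 255) and ends far worse off than it would under mutual trust.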
This game is called the prisoner's dilemma because it is based on a logic problem involving two prisoners, isolated from each other, from whom the police are trying to get a confession. If neither confesses, there will be insufficient evidence against them and they can only be put away on a lesser charge. If one confesses and the other does not, the one who confesses goes free on immunity, while the other bears the heaviest penalty. If both confess, they each get a reduced sentence on a plea bargain. In this game, confessing corresponds to choosing 1, and keeping quiet corresponds to choosing 0.