ANARCHY AND GAME THEORY

Doug Newdick.

1. Introduction.

In any discussion of anarchism, or of the conditions for a stateless society,
sooner or later a claim like this surfaces: "people are too selfish for
that to work". Such claims, I believe, rest upon an assumption (or theory)
about human nature that is taken to be evidently true rather than argued
for. Often I hear a version of "I'm sorry, but I just have a more
pessimistic view of people than you." The purpose of this essay is to show
that even if we grant the assumptions of selfish rationality, cooperation
without the state is still a possibility.

2. The anti-anarchist/Hobbesian argument.

2.1. The intuitive argument.

With these sorts of objections to anarchism ("people are too selfish to
cooperate without laws" etc) I think people are tacitly appealing to an
argument of the form:

1.	People are selfish (rational egoists).
2.	Selfish people won't cooperate if they aren't forced to.
3.	Anarchism involves the absence of force.
4.	Therefore people won't cooperate in an anarchy.

The opponent of anarchism can then argue either that, since anarchy also
requires cooperation, it involves a contradiction; or that a society without
cooperation would be awful, and therefore an anarchy would be awful.

2.2. Taylor's (1987) version.

If we call the two options (strategies) available to the individual
cooperation (C) and defection (D) (non-cooperation), then we can see the
similarities between the intuitive argument and Taylor's (1987)
interpretation of Hobbes's (1968) argument for the necessity for, or
justification of, the state: "(a) in the absence of any coercion, it is in
each individual's interest to choose strategy D; the outcome of the game
is therefore mutual Defection; but every individual prefers the mutual
Cooperation outcome; (b) the only way to ensure that the preferred outcome
is obtained is to establish a government with sufficient power to ensure
that it is in every man's interest to choose C." (Taylor 1987: 17)
We can see from this that the argument appears to be formalisable in terms
of game theory, specifically in the form of a prisoners' dilemma game.

3. The prisoners' dilemma.



3.1 The prisoners' dilemma.[1]

To say an individual is rational, in this context, is to say that she
maximises her payoffs. If an individual is egoistic (ie selfish) then her
payoff is solely in terms of her own utility. Thus the rational egoist will
choose those outcomes which have the highest utility for herself.
In the traditional illustration of the prisoners' dilemma, two criminals
have committed a heinous crime and have been captured by the police. The
police know that the two individuals have committed this crime, but do not
have enough evidence to convict them. However the police do have enough
evidence to convict them of a lesser offence.  The police (and perhaps a
clever prosecuting attorney) separate the two thugs and offer them each a
deal. The criminals each have two options: to remain quiet or to squeal on
their partner in crime. If they squeal on their companion and their
companion remains quiet, they will get off; if both squeal, they will
receive medium sentences; if they remain quiet and their companion squeals,
they will receive the heaviest sentence; and if neither squeals, they will
each receive light sentences. The two are unable to communicate with each
other, and must make their decisions in ignorance of the other's choice.
There are four possible outcomes for each player in this game: getting off
scot free, which we will say has a utility of 4; getting a light sentence,
which has a utility of 3; getting a medium sentence, which has a utility of
2; and getting a heavy sentence, which has a utility of 1. If we label the
strategy of staying quiet "C" (for Cooperation), and label the strategy of
squealing "D" (for Defection), then we get the following payoff matrix:

		Player 2
		C	D

Player 1	C	3, 3	1, 4

		D	4, 1	2, 2

(where each pair of payoffs is ordered: Player 1, Player 2)

It is obvious from this that no matter which strategy the other player
chooses, each player is better off Defecting; therefore the rational choice
is to Defect (in game-theory-speak, Defection is the dominant strategy). As
this is the case for both players, the outcome of the game will be mutual
Defection. However, there is an outcome, mutual Cooperation, which both
players prefer, but because they are rational egoists they cannot obtain
that outcome. This is the prisoners' dilemma.

More generally a prisoners' dilemma is a game with a payoff matrix of the form:

	C	D

 C	x, x	z, y

 D	y, z	w, w

Where y > x > w > z. (The convention is that the rows are chosen by player
1, the columns by player 2, and the payoffs are ordered "player 1, player
2".) (Taylor 1987: 14)

Any situation where the players' preferences can be modelled by this matrix
is a prisoners' dilemma.
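To make the dominance reasoning concrete, here is a minimal sketch in
Python (mine, not part of the original argument); the payoff values are the
ones used above, and the function names are merely illustrative.

# A minimal sketch illustrating the prisoners' dilemma conditions and
# the dominance of Defection.  Moves: "C" = Cooperate, "D" = Defect.
# Each entry is the payoff to one player: (own move, other's move).
PD = {
    ("C", "C"): 3,  # x: mutual Cooperation
    ("C", "D"): 1,  # z: sucker's payoff
    ("D", "C"): 4,  # y: the free rider's payoff
    ("D", "D"): 2,  # w: mutual Defection
}

def is_prisoners_dilemma(game):
    # Check the ordering y > x > w > z from the text.
    x = game[("C", "C")]
    z = game[("C", "D")]
    y = game[("D", "C")]
    w = game[("D", "D")]
    return y > x > w > z

def defection_dominates(game):
    # Defection is dominant if it pays more than Cooperation
    # whatever the other player does.
    return (game[("D", "C")] > game[("C", "C")] and
            game[("D", "D")] > game[("C", "D")])

print(is_prisoners_dilemma(PD))   # True
print(defection_dominates(PD))    # True: the rational egoist Defects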

3.2 Ramifications of the prisoners' dilemma.

Many people have proposed that the prisoners' dilemma is a good analysis of
the provision of public goods and/or of collective action problems in
general; that is, they have taken the preferences of individuals in
cooperative enterprises to be modelled by a prisoners' dilemma. Firstly, the
prisoners' dilemma gives an interesting look at so-called "free rider"
problems in the provision of public goods. In public goods interactions,
free rider problems emerge when a good is produced by a collectivity, and
members of the collectivity cannot be prevented from consuming that good
(in Taylor's terminology the good is non-excludable).[2] In this case a
rational individual would prefer to reap the benefits of the good and not
contribute to its provision (ie Defect): if others Cooperate then the
individual should Defect, and if everyone else Defects then the individual
should also Defect.[3] Secondly, the prisoners' dilemma is taken to be a
good model of the preferences of individuals in their daily interactions
with other individuals, such as fulfilling (or not fulfilling) contractual
obligations, repaying debts, and other reciprocal interactions.
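As a rough illustration of the free-rider logic (the group size, cost, and
benefit figures below are assumptions of mine, not drawn from Taylor's
analysis), the N-person case can be sketched as a public goods game in
which each contribution costs its contributor more than the share of the
benefit it returns to her:

# A rough sketch of the free-rider logic in an N-person public goods
# game.  Illustrative, assumed numbers: each contribution costs the
# contributor 1 and yields a benefit of 0.5 to every member of the
# group, so the group gains overall but the contributor alone loses.

N = 10          # group size (assumed)
COST = 1.0      # private cost of Cooperating (assumed)
BENEFIT = 0.5   # benefit to each member per contribution (assumed)

def payoff(i_cooperate, n_other_cooperators):
    # Payoff to one individual, given her choice and how many of the
    # other N-1 members contribute.
    contributions = n_other_cooperators + (1 if i_cooperate else 0)
    value_of_good = BENEFIT * contributions
    return value_of_good - (COST if i_cooperate else 0.0)

# Whatever the others do, Defecting pays better for the individual...
for k in range(N):
    assert payoff(False, k) > payoff(True, k)

# ...yet everyone Cooperating beats everyone Defecting.
assert payoff(True, N - 1) > payoff(False, 0)
print("Defection dominates, but mutual Cooperation is preferred by all.")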

3.3 My version of the anti-anarchist argument.

Given a game-theoretic interpretation of the claim in S1, and consequently a
game-theoretic interpretation of the intuitive and Hobbesian arguments for
the necessity of the state, we can reformulate them as the following
argument:

1. 	People are egoistic rational agents.
2. 	If people are egoistic rational agents then the provision of public
goods is a Prisoners' Dilemma (PD).
3. 	If the provision of public goods is a PD then, in the absence of
coercion, public goods won't be provided.
4. 	Such coercion can only be provided by the state, not by an anarchy.
5. 	Therefore public goods won't be provided in an anarchy.
6. 	Therefore the state is necessary for the provision of public goods.
7. 	The provision of public goods is necessary for a "good" society.
8. 	Therefore an anarchy won't be a "good" society.
9. 	Therefore the state is necessary for a "good" society.

4. Overview of my criticisms/position.

I think the game-theoretic model is the best (and most plausible) way of
interpreting these sorts of arguments. However, I think that its premises 1
to 4 are false. Against premise 2, following Taylor (1987: ch 2), I argue
that the prisoners' dilemma is not the only plausible preference ordering
for collective action, and that in some of these different games Cooperation
is more likely than in the prisoners' dilemma. The static model of the
prisoners' dilemma game is unrealistic in that most social interactions
recur; a more realistic model, I argue, is the iterated prisoners' dilemma,
in which cooperation (under certain circumstances) is in fact the optimal
strategy (following Taylor 1987, and Axelrod 1984), and thus premise 3 is
false. Finally I argue that premise 1 is false: we do, and should, expect
people to be (somewhat limited) altruists.[4]
 
5. Provision of public goods isn't always a prisoners' dilemma.

For a game to be a prisoners' dilemma, it must fulfil certain conditions:
"each player must (a) prefer non-Cooperation if the other player does not
Cooperate, (b) prefer non-Cooperation if the other player does Cooperate.
In other words: (a') neither individual finds it profitable to provide any
of the public good by himself; and (b') the value to a player of the amount
of the public good provided by the other player alone (ie, the value of
being a free rider) exceeds the value to him of the total amount of the
public good provided by joint Cooperation less his costs of Cooperation."
(Taylor 1987: 35)
For many public good situations either (a'), (b'), or both fail to obtain.

5.1 Chicken games.

If condition (a') fails we can get what Taylor calls a Chicken game; that
is, in a situation where it pays a player to provide the public good even
if the other player Defects, but where both players would prefer to let the
other provide the good, we get this payoff matrix:

	C	D

 C	3, 3	2, 4

 D	4, 2	1, 1

Taylor (1987: 36) gives an example of two neighbouring farms maintaining an
irrigation system, where the result of mutual Defection is so disastrous
that either individual would prefer to maintain the system herself. Thus
this game will model certain kinds of reciprocal arrangements that are not
appropriately modelled by a prisoners' dilemma game.

5.2 Assurance games.

If condition (b') fails to obtain we can get what Taylor (1987: 38) calls an
Assurance game, that is, a situation where neither player can provide a
sufficient amount of the good by contributing alone; thus for each player,
if the other Defects then she should also Defect, but if the other
Cooperates then she would prefer to Cooperate as well. Thus the payoff
matrix looks like this:

	C	D

 C	4, 4	1, 2

 D	2, 1	3, 3

5.3 Cooperation in a Chicken or Assurance game.

There should be no problem with mutual Cooperation in an Assurance game
(Taylor 1987: 39), because the preferred outcome for both players is that of
mutual Cooperation. In the one-off Chicken game mutual Cooperation is not
assured; however, it is more likely than in a one-off prisoners'
dilemma.[5]
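The contrast between the three games can be made concrete with a small
sketch (mine, not Taylor's) that computes the pure-strategy equilibria of
each payoff matrix given above: mutual Cooperation is an equilibrium of the
Assurance game, the asymmetric outcomes are the equilibria of Chicken, and
only mutual Defection is an equilibrium of the prisoners' dilemma.

# A small sketch: pure-strategy equilibria of the three 2x2 games
# discussed above.  Payoffs are (player 1, player 2).

GAMES = {
    "Prisoners' dilemma": {("C", "C"): (3, 3), ("C", "D"): (1, 4),
                           ("D", "C"): (4, 1), ("D", "D"): (2, 2)},
    "Chicken":            {("C", "C"): (3, 3), ("C", "D"): (2, 4),
                           ("D", "C"): (4, 2), ("D", "D"): (1, 1)},
    "Assurance":          {("C", "C"): (4, 4), ("C", "D"): (1, 2),
                           ("D", "C"): (2, 1), ("D", "D"): (3, 3)},
}

def equilibria(game):
    # An outcome is an equilibrium if neither player gains by
    # unilaterally switching strategy.
    moves = ("C", "D")
    eqs = []
    for m1 in moves:
        for m2 in moves:
            other1 = "D" if m1 == "C" else "C"
            other2 = "D" if m2 == "C" else "C"
            p1_ok = game[(m1, m2)][0] >= game[(other1, m2)][0]
            p2_ok = game[(m1, m2)][1] >= game[(m1, other2)][1]
            if p1_ok and p2_ok:
                eqs.append((m1, m2))
    return eqs

for name, game in GAMES.items():
    print(name, equilibria(game))
# Prisoners' dilemma: [('D', 'D')]
# Chicken:            [('C', 'D'), ('D', 'C')]
# Assurance:          [('C', 'C'), ('D', 'D')]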

6. Cooperation is rational in an iterated prisoners' dilemma.

6.1 Why iteration.

Unequivocally, there is no chance of mutual Cooperation in a one-off
prisoners' dilemma, but, as has been pointed out, the one-off game is not a
very realistic model of social interactions, especially public good
interactions (Taylor 1987: 60). Most social interactions are repeated,
sometimes within a group (an N-person game), sometimes between specific
individuals (which might be modelled as a game between two players). The
question then becomes: is mutual Cooperation more likely in iterated games
(specifically the iterated prisoners' dilemma)? As one would expect, the
fact that the games are repeated (with the same players) opens up the
possibility of conditional Cooperation, ie Cooperation dependent upon the
past performance of the other player.

6.2 Iterated prisoners' dilemma.

There are two important assumptions to be made about iterated games.
Firstly, it is assumed (very plausibly) that the value of future games to a
player is less than the value of the current game. The amount by which the
value of future games is discounted is called the discount value; the
higher the discount value, the less future games are worth (Taylor 1987:
61). Secondly, it is assumed that the number of games to be played is
indefinite. If the number of games is known to the players then the
rational strategy will be to Defect on the last game, because this cannot
be punished by the other player. Once this is assumed by both players, the
second to last game becomes in effect the last game, and so on (Taylor
1987: 62).

Axelrod (1984) used an ingenious method to test what would be the best
strategy for an iterated prisoners' dilemma: he held two round-robin
computer tournaments in which each strategy (computer program) competed
against each of its rivals a number of times. Surprisingly, the simplest
program, one called TIT FOR TAT, won both tournaments, as well as all but
one of a number of hypothetical tournaments. Axelrod's results confirmed
what Taylor had proven in 1976.[6] TIT FOR TAT is the strategy of choosing
C in the first game and thereafter choosing whatever the other player chose
in the last game (hereafter TIT FOR TAT will be designated strategy B,
following Taylor (1987)).
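To give a feel for Axelrod's method, here is a toy round-robin sketch in
Python; the strategies, round count, and field of entries are illustrative
choices of mine, not Axelrod's actual entries or settings (his tournaments
had a much richer field, which is what TIT FOR TAT won).

# A toy round-robin iterated-PD tournament in the spirit of Axelrod
# (1984).  Each strategy plays every strategy (including itself).

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (1, 4),
          ("D", "C"): (4, 1), ("D", "D"): (2, 2)}

def tit_for_tat(history_own, history_other):
    # Cooperate first, then copy the other player's last move.
    return "C" if not history_other else history_other[-1]

def always_defect(history_own, history_other):
    return "D"

def always_cooperate(history_own, history_other):
    return "C"

def grudger(history_own, history_other):
    # Cooperate until the other player Defects, then Defect forever.
    return "D" if "D" in history_other else "C"

def play(strat1, strat2, rounds=200):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1 = strat1(h1, h2)
        m2 = strat2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        score1, score2 = score1 + p1, score2 + p2
        h1.append(m1)
        h2.append(m2)
    return score1, score2

strategies = {"TIT FOR TAT": tit_for_tat,
              "ALWAYS DEFECT": always_defect,
              "ALWAYS COOPERATE": always_cooperate,
              "GRUDGER": grudger}

totals = {name: 0 for name in strategies}
for name1, s1 in strategies.items():
    for name2, s2 in strategies.items():
        score1, _ = play(s1, s2)
        totals[name1] += score1

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(name, total)
# In this toy field TIT FOR TAT (tied with the similarly conditional
# GRUDGER) finishes ahead of ALWAYS DEFECT: conditional Cooperators
# prosper because they earn the rewards of mutual Cooperation with
# each other.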

An equilibrium in an iterated game is defined as "a strategy vector such
that no player can obtain a larger payoff using a different strategy while
the other players' strategies remain the same. An equilibrium, then, is
such that, if each player expects it to be the outcome, he has no
incentive to use a different strategy" (Taylor 1987: 63). Put informally,
an equilibrium is a pair of strategies such that a unilateral move by either
player away from her strategy will not improve that player's payoff. Mutual
Cooperation will then arise if (B, B) is an equilibrium, because no strategy
will do better than B when playing against B.[7]

The total payoff for a strategy that receives the same payoff x in every
game of an (indefinitely) iterated prisoners' dilemma is the sum of an
infinite series:

x + xw + xw^2 + ... = x/(1 - w)

x = payoff per game
w = discount parameter (1 - discount value)

UD (the strategy of Unconditional Defection) playing with UD gets a payoff
of 2 per game for mutual Defection; if we set w = 0.9 then UD's payoff is:

2/(1 - 0.9) = 20

B playing with B gets a payoff of 3 per game for mutual Cooperation, thus
with w = 0.9 B gets:

3/(1 - 0.9) = 30

(B, B) is an equilibrium when the payoff for B from (B, B) is at least as
high as the payoff for UD from (UD, B):

B's payoff against B is:

3/(1 - w)

UD's payoff against B is 4 in the first game (Defecting against B's
Cooperation) and 2 per game thereafter (mutual Defection):

4 + 2w/(1 - w)

Therefore UD cannot do better than B when:

3/(1 - w) > 4 + 2w/(1 - w)

which simplifies to:

w > (4 - 3)/(4 - 2)

ie w > 0.5

(Axelrod 1984: 208)[8][9]
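These discounted sums are easy to check numerically; the following Python
sketch (mine, with an arbitrary selection of discount parameters) computes
B's payoff against B and UD's payoff against B, and shows that UD stops
doing better once w exceeds 0.5.

# A numerical check of the threshold w > 0.5.
# Payoffs per game: mutual C = 3, mutual D = 2, D against C = 4.

def payoff_B_vs_B(w):
    # B (TIT FOR TAT) against itself: 3 in every game.
    return 3 / (1 - w)

def payoff_UD_vs_B(w):
    # UD against B: 4 in the first game, then 2 per game forever.
    return 4 + 2 * w / (1 - w)

for w in (0.3, 0.5, 0.7, 0.9):
    b, ud = payoff_B_vs_B(w), payoff_UD_vs_B(w)
    print(f"w = {w}: B vs B = {b:.2f}, UD vs B = {ud:.2f}, "
          f"UD cannot do better: {b >= ud}")

# w = 0.3: B vs B =  4.29, UD vs B =  4.86  -> UD does better
# w = 0.5: B vs B =  6.00, UD vs B =  6.00  -> break-even point
# w = 0.7: B vs B = 10.00, UD vs B =  8.67  -> B does better
# w = 0.9: B vs B = 30.00, UD vs B = 22.00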

Can any other strategy fare better against B than B itself? Informally, we
can see that this is not possible (assuming future interactions are not too
heavily discounted). For any strategy to do better than B, it must at some
point Defect. But if the strategy Defects, then B will punish this
Defection with a Defection of its own, which must result in the new
strategy doing worse than it would have done had it Cooperated. Thus no
strategy can do better playing with B than B itself.
Now if (B, B) is an equilibrium, then the payoff matrix for the iterated
game (with the supergame payoffs ranked from 4, the best, to 1, the worst)
is:

	B	UD

B	4, 4	1, 3

UD	3, 1	2, 2

This is an Assurance game. Thus if (B, B) is an equilibrium then we should
expect mutual Cooperation (Taylor 1987: 67). If, however, (B, B) isn't an
equilibrium (ie the discount value is too high), then the payoffs resemble
a prisoners' dilemma, and mutual Defection will be the result (Taylor
1987: 67).

6.3 Iterated N-person prisoners' dilemma.

A more realistic model of (some) social interactions, especially public
goods interactions, is the N-person iterated prisoners' dilemma, that is,
an iterated prisoners' dilemma with more than two players (an indefinite
number for the purposes of analysis). The analysis is too complex to
reproduce here,[10] but the results of the analysis of the 2-person
iterated prisoners' dilemma can be applied more or less straightforwardly
to the N-person case. If Cooperation is to arise, at least some of the
players must be conditional Cooperators (ie utilising something like B),
and "it has been shown that under certain conditions the Cooperation of
some or all of the players could emerge in the supergame no matter how many
players there are." (Taylor 1987: 104)

6.4 Conditions for conditional cooperation.

For mutual Cooperation to arise, a strategy similar to B needs to be used
by individuals, and (B, B) needs to be an equilibrium. For the latter to be
the case, the discount parameter needs to be sufficiently high. For the
former, individuals need to be able to tell whether other individuals are
Cooperating or Defecting.

The discount parameter depends upon the chance of a player having further
interactions with the other player, and upon the frequency of those
interactions. The greater the probable time between interactions, and the
smaller the probable number of interactions, the lower the discount
parameter and the lower the chance of getting mutual Cooperation. There are
a number of ways in which the discount parameter can be increased (Axelrod
1984: 129-132): increasing territoriality (reducing population mobility);
increasing specialisation; concentrating interactions, so that an
individual has more interactions with a smaller number of individuals; and
decomposing large interactions into a greater number of smaller ones.

If people are to employ a strategy such as B, they need to be able to
monitor the behaviour of other players. Thus it seems that mutual
Cooperation will be more likely in smaller societies than in larger ones.
If the relations between individuals are direct and many-sided (ie, they
interact with others without any mediation, and they interact with them in
a number of different ways) then monitoring behaviour is much easier. This
would translate into a less stringent size requirement. Such properties are
to be found in societies that have the property of "community" (Taylor
1987: 105, 1982).

6.5 The evolution of TIT FOR TAT

As TIT FOR TAT is the best strategy under certain conditions, we would
expect that organisms which evolved under these conditions might well use
this strategy as an adaptation.[footnote: With all of the usual riders,
such as: the variation might not have arisen; constraints of other
structures might prevent this; etc.] This expectation is supported by a
number of apparent examples of TIT FOR TAT behaviour amongst certain
organisms that do live under iterated prisoners' dilemma conditions
(Dawkins 1989: 229-233). If much human social interaction does take the
form of a prisoners' dilemma (and we have seen that if this is the case
then these will mostly be iterated), and if we assume that much of the
evolutionary history of humans and their ancestors was spent in small
groups (as evidence suggests), then we might expect that humans have
evolved such a behavioural strategy.
One must be wary of drawing too strong a conclusion about humans and human
behaviour from evolutionary arguments; human behaviour is notoriously
complex and very plastic, unlike much animal behaviour. However, I do think
that this argument gives an additional reason for being optimistic about
the possibility of mutual Cooperation.

7. Altruism.

7.1 Altruism is not a rare phenomenon.

The purpose of the preceding section was to show that even if we grant the
anti-anarchist her most pessimistic assumptions about humans (that they are
rational egoists) and social interactions (that they have the preference
structure of a prisoners' dilemma), mutual Cooperation can still be
achieved. I have already criticised the latter assumption in S5, but the
former assumption, too, is obviously flawed.[footnote: This assumption is
acceptable as an idealisation when we have a specific explanatory or
predictive purpose in mind (presuming it does not give us bad results), but
in this justificatory role its inadequacies are central to the question at
hand.] People are not egoistic. If we think for more than a few moments we
should be able to come up with a number of examples of pure altruism,
examples where no benefit whatsoever accrues to the performer of the
action, not to mention examples of impure altruism. Donating blood is a
good example of pure altruism: no (measurable) benefit accrues to someone
who donates blood (without publicising it), yet the benefit to others could
be great, and there is a cost (even if it is not substantial). Then there
are examples such as child-rearing. The cost of rearing a child is
substantial, in terms of both money and other resources (eg time, missed
opportunities, etc), yet the benefit accrues mainly to the child, not the
parent.

7.2 Kin Selection.

An explanation for certain kinds of apparent altruism, and possibly for a
greater than expected degree of reciprocal Cooperation, can be found in the
theory of kin selection. Taking the gene's-eye view proposed by Dawkins
(1989),[11] imagine a gene for green beards. If this gene, besides causing
green beards, causes the carrier of the gene to help other individuals with
green beards, it has a greater than usual chance of spreading through a
population. In a normal population an organism is more likely to share
genes with its relations than with another member of the population: for
any gene that is in your body, there is a 50% chance that it is in the body
of your sibling, and a 12.5% chance, for each of your first cousins, that
it is in theirs. Thus from the gene's perspective, if you sacrifice
yourself to save the lives of three of your siblings, the gene has in fact
gained (because more copies of it were preserved than perished). This is
the mechanism of kin selection. The more closely you are related to
someone, the more it benefits the unit of selection (the entity which
benefits from natural selection), in this case the gene, if you aid them,
with the weight given to their benefit directly proportional to the index
of relatedness (Dawkins 1989: ch 6). In game-theoretic terms, in any game
between kin the payoff to the gene is equal to the utility to the
individual it is in, plus the utility to the other individual times their
index of relatedness:


The payoff in games between kin for player 1 = z + xy

where:	x = index of relatedness
	y = player 2's utility
	z = player 1's utility

Index of relatedness = the chance that a gene in X is also present in X's
relation Y.

For example, if the two players are siblings the value of x is 0.5, and
thus the transformed prisoners' dilemma looks like this:

	C		D

C	4.5, 4.5	3, 4.5

D	4.5, 3		3, 3

In this case we should expect mutual Cooperation to be the outcome, because
it is an equilibrium and is preferred by both players.
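The transformation is easy to verify; the following Python sketch (mine,
with r = 0.5 for siblings) applies it to the original payoff matrix and
checks that Defection no longer pays against a Cooperator.

# The kin-selection transformation payoff' = own utility + r * other's
# utility, applied to the original prisoners' dilemma with index of
# relatedness r = 0.5 (siblings).

PD = {("C", "C"): (3, 3), ("C", "D"): (1, 4),
      ("D", "C"): (4, 1), ("D", "D"): (2, 2)}

def kin_transform(game, r):
    # Each player's payoff becomes her own utility plus r times the
    # other player's utility.
    return {moves: (p1 + r * p2, p2 + r * p1)
            for moves, (p1, p2) in game.items()}

kin_pd = kin_transform(PD, r=0.5)
for moves, payoffs in kin_pd.items():
    print(moves, payoffs)
# ('C', 'C') (4.5, 4.5)
# ('C', 'D') (3.0, 4.5)
# ('D', 'C') (4.5, 3.0)
# ('D', 'D') (3.0, 3.0)

# Against a Cooperator, Defecting no longer pays more than Cooperating,
# so mutual Cooperation is an equilibrium of the transformed game.
assert kin_pd[("C", "C")][0] >= kin_pd[("D", "C")][0]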

As we know from S6, the value of the discount parameter required for (B, B)
to be an equilibrium decreases as the difference between the payoff for
Defecting whilst the other player Cooperates and the payoff for mutual
Cooperation decreases. Thus mutual Cooperation is easier to achieve when
the mechanism of kin selection is operating.

It is also possible that such a mechanism might over-generalise, that is,
identify too many people as being related enough to alter behaviour in
prisoners' dilemma type situations. When you consider that for much of our
recent evolutionary history humans have lived in small bands where the
average index of relatedness was fairly high (especially compared to
today), such over-generalisations would not have generated many false
positives.[12]
The mutual Cooperation engendered by kin selection can help the spread of
reciprocal Cooperation. It can create a large enough cluster of conditional
Cooperators to make conditional Cooperation the best strategy in the
population. If a cluster of conditional Cooperators invades a population of
Unconditional Defectors, once the number of conditional Cooperators reaches
a certain level (dependent upon the discount parameter), the conditional
Cooperators earn more than the Unconditional Defectors in virtue of their
interactions with each other (Axelrod 1984: ch 3).
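The cluster argument can be illustrated numerically. The following sketch
is a toy version of mine, in the spirit of Axelrod's cluster analysis; it
simplifies by assuming the native Unconditional Defectors interact almost
entirely with one another, and asks what proportion of a conditional
Cooperator's interactions must be with other conditional Cooperators before
the cluster outscores the natives.

# A toy version of the cluster-invasion argument from Axelrod (1984:
# ch 3).  Simplifying assumption: the native Unconditional Defectors
# interact almost entirely with one another, so their average score is
# just the mutual-Defection supergame score.

def v_b_vs_b(w):    return 3 / (1 - w)          # B with B: 3 every game
def v_b_vs_ud(w):   return 1 + 2 * w / (1 - w)  # B with UD: 1, then 2s
def v_ud_vs_ud(w):  return 2 / (1 - w)          # UD with UD: 2 every game

def minimal_cluster_share(w, step=0.001):
    # Smallest proportion p of a B-player's interactions that must be
    # with other B-players for B's average score to beat the natives'.
    p = 0.0
    while p <= 1.0:
        avg_b = p * v_b_vs_b(w) + (1 - p) * v_b_vs_ud(w)
        if avg_b > v_ud_vs_ud(w):
            return p
        p += step
    return None

for w in (0.3, 0.6, 0.9):
    print(w, minimal_cluster_share(w))
# The higher the discount parameter, the smaller the cluster of
# conditional Cooperators needed for them to prosper.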

8. Summary

I have shown that premises 2 and 3 of the intuitive/Hobbesian argument (as
reformulated in S3.3) are false. Therefore the conclusions that anarchies
are non-viable, and that the state is in a sense necessary, do not follow.
The analysis of the iterated prisoners' dilemma shows that even if we grant
the opponent of anarchy her best case, her conclusion just does not follow.
Game theory shows us that even egoistic individuals will cooperate without
coercion or coordination, given certain conditions, conditions which are
practically possible. When added to Taylor's (1982) thesis that coercion
can be utilised by an anarchic community to encourage Cooperation, the
plausibility of an anarchy increases. I think that the analysis from game
theory and kin selection should leave us optimistic about the possibility
of Cooperation without coercion, even under adverse circumstances, and thus
that the changes in human nature required for a viable anarchy are much
smaller than the opponents of anarchy believe.

Bibliography.


Axelrod, 1984, The Evolution of Cooperation, Basic Books, New York.

Dawkins, 1989, The Selfish Gene, Oxford University Press, Oxford.

Hardin, 1982, Collective Action, Johns Hopkins University Press, Baltimore.

Hobbes, 1968, Leviathan, ed C.B. Macpherson, Pelican Classics.

Lewontin et al, 1984, Not In Our Genes, Pantheon, New York.

Lukes, 1974, Power: A Radical View, Macmillan Press.

Mansbridge (ed), 1990, Beyond Self-Interest, University of Chicago Press,
Chicago.

Palfrey & Rosenthal, 1992, "Repeated Play, Cooperation and Coordination: An
Experimental Study", Social Science Working Paper 785, California Institute
of Technology, Pasadena.

Taylor, 1982, Community, Anarchy & Liberty, Cambridge University Press,
Cambridge.

----, 1987, The Possibility of Cooperation, Cambridge University Press,
Cambridge. 

---- (ed), 1988, Rationality and Revolution, Cambridge University Press,
Cambridge.

Wright et al, 1992, Reconstructing Marxism, Verso, London.

Footnotes

1: Much of this section is drawn from Taylor 1987 and Axelrod 1984.

2: Taylor (1987: 6) says that free rider problems arise only when the
collective good is non-excludable but not indivisible (that is, when
consumption of the good by an individual results in less of the good being
available to others). I don't believe that this is the case; we are surely
able to free ride on the public good of parklands etc by not paying our
taxes.

3: This is really an example of an N-person prisoners' dilemma, rather
than a normal prisoners' dilemma. See Taylor 1987: ch 4.

4: Taylor 1982 can be taken as an argument against premise 4; I concur, but
will not go into that argument here.

5: For a full presentation of the mathematical argument for this conclusion
see Taylor 1987: 39-59.

6: In his book "Anarchy and Cooperation". Taylor 1987 is a substantial
revision of this book. Taylor (1987: 70) points out that he had already
proven what Axelrod proved with his tournaments; however, Axelrod's method
was more interesting.

7: Note that Unconditional Defection (UD) is an equilibrium: any strategy
that Cooperates at any point with UD will score less than UD in that game.

8: B also has to do better than a strategy that alternates Cooperation with
Defection, which is also the case when w > 0.5.

9: Strictly speaking, (B, B) being an equilibrium is a function of the
relation between the discount parameter and the values of the payoffs.
Using the letters of the general matrix in S3.1, (B, B) is an equilibrium
when the discount parameter exceeds both (y - x)/(y - w) and
(y - x)/(x - z). For the payoffs I am using, this is the case if the
discount parameter exceeds 0.5.

10: See Taylor 1987: ch 4 for a detailed analysis of the N-person iterated
prisoners' dilemma.

11: This is bad philosophy of biology, but it gets the point across easily.

12: Yet again, this argument should not be taken too seriously; it merely
adds additional reasons to be optimistic that humans are more inclined
towards mutual Cooperation than is predicted by the purely egoistic model.