I was brushing up on my maximum entropy and probability theory the other day and came across a great passage in Jaynes' book about convergence and divergence of views. He applies basic Bayesian probability theory to the concept of changing public opinion in the face of new data, especially the effect prior states of knowledge (prior probabilities) can have on the dynamics. The initial portion of section 5.3 is reproduced below.
5.3 Converging and diverging views (pp. 126 – 129)
Suppose that two people, Mr A and Mr B have differing views (due to their differing prior information) about some issue, say the truth or falsity of some controversial proposition S. Now we give them both a number of new pieces of information or ’data’, D1,D2,…,Dn, some favorable to S, some unfavorable. As n increases, the totality of their information comes to be more nearly the same, therefore we might expect that their opinions about S will converge toward a common agreement. Indeed, some authors consider this so obvious that they see no need to demonstrate it explicitly, while Howson and Urbach (1989, p. 290) claim to have demonstrated it.
Nevertheless, let us see for ourselves whether probability theory can reproduce such phenomena. Denote the prior information by IA, IB, respectively, and let Mr A be initially a believer, Mr B be a doubter:
| (5.16) |
after receiving data D, their posterior probabilities are changed to
| (5.17) |
| (5.17) |
If D supports S, then since Mr A already considers S almost certainly true, we have P(D|S IA), and so
| (5.18) |
Data D have no appreciable effect on Mr A’s opinion. But now one would think that if Mr B reasons soundly, he must recognize that P(D|S IB) > P(D|IB), and thus
| (5.19) |
Mr B’s opinion should be changed in the direction of Mr A’s. Likewise, if D had tended to refute
S, one would expect that Mr B’s opinions are little changed by it, whereas Mr A’s will move in the direction of Mr B’s. From this we might conjecture that, whatever the new information D, it should tend to bring different people into closer agreement with each other, in the sense that
| (5.20) |
Although this can be verified in special cases, it is not true in general.
Is there some other measure of ‘closeness of agreement’ such as log[P(S|D Ia)∕P(S|D IB], for which this converging of opinions can be proved as a general theorem? Not even this is possible; the failure of probability theory to give this expected result tells us that convergence of views is not a general phenomenon. For robots and humans who reason according to the consistency desiderata of Chapter 1, something more subtle and sophisticated is at work.
Indeed, in practice we find that this convergence of opinions usually happens for small children; for adults it happens sometimes but not always. For example, new experimental evidence does cause scientists to come into closer agreement with each other about the explanation of a phenomenon.
Then it might be thought (and for some it is an article of faith in democracy) that open discussion of public issues would tend to bring about a general consensus on them. On the contrary, we observe repeatedly that when some controversial issue has been discussed vigorously for a few years, society becomes polarized into opposite extreme camps; it is almost impossible to find anyone who retains a moderate view. The Dreyfus affair in France which tore the nation apart for 20 years, is one of the most thoroughly documented examples of this (Bredin, 1986). Today, such issues as nuclear power, abortion, criminal justice, etc., are following the same course. New information given simultaneously to different people may cause a convergence of views; but it may equally well cause a divergence.
This divergence phenomenon is observed also in relatively well-controlled psychological experiments. Some have concluded that people reason in a basically irrational way; prejudices seem to be strengthened by new information which ought to have the opposite effect. Kahneman and Tversky (1972) draw the opposite conclusion from such psychological tests, and consider them an argument against Bayesian methods.
But now in view of the above ESP example, we wonder whether probability theory might also account for this divergence and indicate that people may be, after all, thinking in a reasonably rational, Bayesian way (i.e. in a way consistent with their prior information and prior beliefs). The key to the ESP example is that our new information was not
S ≡ fully adequate precautions against error or deception were taken, and Mrs Stewart did in fact deliver that phenomenal performance.
It was that some ESP researcher has claimed that S is true. But if our prior probability for S is lower than our prior probability that we are being deceived, hearing this claim has the opposite effect on our state of belief from what the claimant intended.
The same is true in science and politics; the new information a scientist gets is not that an experiment did in fact yield this result, with adequate protection against error. It is that some colleague has claimed that it did. The information we get from TV evening news is not that a certain event actually happened in a certain way; it is that some news reporter claimed that it did.
Scientists can reach agreement quickly because we trust our experimental colleagues to have high standards of intellectual honesty and sharp perception to detect possible sources of error. And this belief is justified because, after all, hundreds of new experiments are reported every month, but only about once in a decade is an experiment reported that turns out later to have been wrong. So our prior probability for deception is very low; like trusting children, we believe what experimentalists tell us.
In politics, we have a very different situation. Not only do we doubt a politician’s promises, few people believe that news reporters deal truthfully and objectively with economic, social, or political topics. We are convinced that virtually all news reporting is selective and distorted, designed not to report the facts, but to indoctrinate us in the reporter’s socio-political views. And this belief is justified abundantly by the internal evidence in the reporter’s own product – every choice of words and inflection of voice shifting the bias invariably in the same direction.
Not only in political speeches and news reporting, but wherever we seek for information on political matters, we run up against this same obstacle; we cannot trust anyone to tell us the truth, because we perceive that everyone who wants to talk about it is motivated either by self-interest or by ideology. In political matters, whatever the source of information, our prior probability for deception is always very high. However, it is not obvious whether this alone can prevent us from coming to agreement.
Jaynes, E.T., Probability Theory: The Logic of Science (Vol 1), Cambridge University Press, 2003.