# Calculating belief

In previous posts I referenced C.S. Peirce’s (1839-1914) concept of abduction, or evidential reasoning, i.e. establishing the support a particular proposition (hypothesis) has from evidence. The theories of the Presbyterian minister/statistician Thomas Bayes (1702-1761) have relevance here. Peirce knew of Bayes’s work on probability, though he didn’t support Bayes’s approach to mathematical reasoning.

Nevertheless, Bayesian approaches to handling data and evidence are gaining ground as researchers devise ever better algorithms to process big data. What is a Bayesian approach to calculating the likelihood that global climate change is correlated with (caused by) human intervention?

### Bean bags

Here’s a much simpler problem than climate change. Bayes helps us estimate which bag a sample of beans comes from.

An experimenter has two bags containing the same number of beans. One bag contains 10 white beans and 30 black beans. The second bag contains 20 of each colour, black and white. The bags look identical. At random, the experimenter secretly reaches into one of the bags and draws out a bean. It happens to be black. You are the subject (participant) in this simple experiment. The experimenter asks you: did the black bean come from the first bag (with more black beans than white beans) or the second bag (with equal numbers of black and white beans)?

There’s no way of knowing for sure which bag the bean comes from, but you could calculate the probability that it came from either bag. It’s more likely that it came from the bag with the most black beans, i.e. the first bag. In fact the odds are 60:40 in favour of the first bag, i.e. the probability is 0.6.

Here’s how that is calculated from Bayes’s Theorem. There are two mutually exclusive hypotheses. The black bean either comes from the first bag, or it comes from the second bag. Here’s what you know already.

• There are 80 beans in total.
• The probability that a bean taken at random from the first bag is black, is 0.75 (30/40).
• The probability that either bag was selected randomly by the experimenter is 0.5 (there are just two bags).
• Taking the contents of both bags together (i.e. 30 white beans and 50 black beans), the probability that someone would randomly pick a black bean is 50/80 = 0.625.

According to Bayes’s Theorem (below) the probability that the bean comes from the first bag is (0.75 x 0.5)/0.625 = 0.6

What about the hypothesis that the bean is from the second bag? The probability that a bean taken at random from the second bag is black, is 0.5 (20/40). The other probabilities remain the same. So the probability that the black bean came from the second bag is (0.5 x 0.5)/0.625 = 0.4. The two posterior probabilities sum to 1, as they must.
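The whole calculation can be checked with a few lines of Python. This is a minimal sketch, not from the original post; the function and variable names are my own:

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes's Theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence

p_evidence = 50 / 80  # P(E): 50 black beans out of 80 beans in total

# First bag: 30 of its 40 beans are black; second bag: 20 of 40.
# Each bag has a 0.5 prior probability of being chosen.
p_bag1 = posterior(30 / 40, 0.5, p_evidence)
p_bag2 = posterior(20 / 40, 0.5, p_evidence)

print(p_bag1)  # 0.6
print(p_bag2)  # 0.4
```

The two posteriors sum to 1, which is a useful sanity check on the arithmetic.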

### Repeating the experiment

If the experimenter puts the bean back into its correct bag, and keeps repeating the experiment, pulling out black and/or white beans at random, then you would put your money on the first bag every time you saw a black bean, and on the second bag every time you saw a white bean. You would sometimes lose, but over a number of iterations you would come out ahead. (That’s probability!)
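That betting strategy can be simulated. Here is a hypothetical sketch (the bag contents are as above; the structure, names and seed are my own assumptions):

```python
import random

random.seed(1)  # arbitrary seed for a reproducible run

BAGS = {"first": {"black": 30, "white": 10},
        "second": {"black": 20, "white": 20}}

def likelihood(bag: str, colour: str) -> float:
    """P(colour | bag): the chance of drawing that colour from that bag."""
    counts = BAGS[bag]
    return counts[colour] / sum(counts.values())

wins = 0
trials = 10_000
for _ in range(trials):
    true_bag = random.choice(list(BAGS))  # experimenter picks a bag at random
    colour = "black" if random.random() < likelihood(true_bag, "black") else "white"
    # Bet on whichever bag makes the observed colour more likely
    # (the priors are equal, so comparing likelihoods is enough).
    bet = max(BAGS, key=lambda bag: likelihood(bag, colour))
    wins += (bet == true_bag)

print(wins / trials)  # roughly 0.625 over many trials
```

The expected win rate is 0.5 x 0.75 + 0.5 x 0.5 = 0.625, so the strategy beats guessing, but far from every time.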

Here’s Bayes’s Theorem, taken from the excellent Wikipedia entry. In symbols:

P(H|E) = P(E|H) x P(H) / P(E)

• P(H|E) means the probability of the hypothesis, i.e. which bag the bean comes from, given the evidence (a black or white bean). This is also known as the posterior probability. This is what you want to find out, and there will be competing hypotheses. In this case the two bags as bean sources represent two hypotheses.
• P(E|H) is the probability that the evidence follows from the hypothesis, i.e. if you knew definitely that the bean is from the first bag, what is the probability that it is black?
• P(E) is the probability of the evidence appearing, whatever its source, i.e. the chance that you are handed a black bean (as if all the beans from both bags are tipped into a single new container and the bean is taken from that).
• P(H) is the probability of either hypothesis given no particular evidence at all. This is called the prior probability. In this case, without looking at beans, there’s a 0.5 chance the bean is from the first bag, and the same 0.5 chance that it is from the second. When the priors are equal, as here, this factor can be discarded when comparing hypotheses, since it scales both posteriors by the same amount.

### Doubt

This experiment assumes you know the number of each bean colour in each bag, but imagine you are not so sure. The experimenter could be having you on, or may have got confused about the protocol. You might secretly spy on what the experimenter is doing, interrogate the experimenter about the protocol, or look for other clues about the bag contents.

So the probabilities may be more conjectural on your part. That’s the usual state of any real-world problem dealing in probabilities, such as gathering evidence for the human causes of climate change. So you might have to make a best guess at the various probabilities. How would you apply Bayes’s Theorem to a belief that climate change is man made, as proposed in the previous post? Here’s my attempt at a Bayesian formulation based on an analogy with the bean problem.

### A bag of hurricanes

There are two bags: one labelled “humans are responsible for climate change,” the other labelled with the converse (“climate change would happen anyway whatever humans did”).

In both bags there are some evidential propositions (phenomena). Instead of just black and white beans, there are lots of different evidential phenomena as data: not just hurricane events, icebergs, and forest fires, but statistics about the frequency of these, along with historical data about pollution levels, CO2 emissions and levels, deforestation and forest regeneration, energy consumption, and agricultural practices, as well as other climate change factors such as volcanic eruptions, natural weather cycles, temperature readings over time, sea temperature changes, etc, etc.

For any one of these bits of evidence a Bayesian approach would set up the problem as follows.

• P(H|E) means the probability that humans do or do not cause climate change given a new bit of evidence (e.g. a new bit of data about hurricane frequency). This is what we want to find out for each bit of evidence, then combine them in support of either hypothesis — in the same way that the experimenter iterates the bean bag identification task.
• P(E|H) is the probability that the evidence follows from the hypothesis, i.e. if you knew definitely that human intervention results in the evidential phenomenon, what is the probability of that evidence? E.g. the likelihood that human activity results in increases in CO2 emissions, and the likelihood that a volcanic eruption does something similar. So CO2 emission stats would appear in both hypothesis “bags.” Such probabilities are difficult to obtain and would be sourced from empirical evidence, scientific modelling and expert speculation. (Note that this challenge is different to finding out whether the evidence supports the hypothesis, which is P(H|E). It’s the simpler question: if we believe human intervention causes climate change, what would we expect to happen to sea levels, etc.?)
• P(E) is the probability of the evidence appearing, whatever its source, i.e. the chance of the evidential phenomenon whatever the hypotheses. That’s relatively easy to ascertain from records. We know the frequency of hurricanes and forest fires, sea temperature levels, and CO2 readings in the atmosphere whatever caused them, though people still dispute such figures.
• P(H) is the probability of either hypothesis given no particular evidence at all. This is called the prior probability. In this case we could say there’s a 0.5 chance of either hypothesis being true, since the prior gets factored out when comparing evidence iteratively in support of both hypotheses, though by some means or other the US EPA scientists have demonstrated that there’s a 0.92 chance that humans have contributed to global temperature rises.
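As an illustration only, here is how iteratively combining pieces of evidence might look in code. The likelihood numbers below are invented placeholders, not real climate statistics, and the sketch assumes the pieces of evidence are independent of one another. With equal priors, the posterior odds between the two hypotheses are simply the product of the likelihood ratios, and P(E) cancels out:

```python
# Each entry gives P(E|human) and P(E|natural) for one piece of
# evidence. All numbers are made up for the sake of the example.
evidence = [
    # (description,           P(E|human), P(E|natural))
    ("rising CO2 levels",        0.9,        0.4),
    ("hurricane frequency",      0.6,        0.5),
    ("sea temperature rise",     0.8,        0.3),
]

odds = 1.0  # prior odds P(human)/P(natural) = 0.5/0.5
for name, p_e_human, p_e_natural in evidence:
    odds *= p_e_human / p_e_natural  # multiply in the likelihood ratio
    print(f"after {name}: P(human | evidence so far) = {odds / (1 + odds):.3f}")
```

Each new piece of evidence nudges the belief one way or the other, in the same way that each drawn bean nudges your belief about the bags.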

So Bayes’s Theorem provides a way not only of calculating belief, but also of accounting for some of the controversies amongst believers and doubters.

### References

I’m grateful to YiPing Cao and Rong Rong for correcting my calculation above [12 Oct 2019].