Probability And Randomness: Introducing Independence
What are the chances of a coin toss coming up heads again after landing heads ten straight times? What is an independent event in probability theory? And what is the Gambler's Fallacy? Today on the blog Dominic Yeo introduces us to the concept of independence in the first of his series exploring probability and randomness.
In upcoming posts, I will be explaining the mathematical theory underlying the fascinating topic of probability and randomness. As with so many things, it is easy to descend quickly into lots of technicalities, so I’m going to try to use as many topical examples as possible, and opt for pictures rather than the much-dreaded algebra symbols wherever possible.
So, my focus for this first post lies with a simple question:
“If I toss a coin ten times, and it comes up heads every time, is the next toss more likely to be a head or a tail?”
Considering A Coin Toss
It is a natural human tendency to offer emotional or psychological arguments even in situations which are set up entirely abstractly. Let’s consider two possible responses to the situation described just above. Optimistic Olivia might say “Heads are clearly on a roll, so it’s more likely that the next result will be a head also.” To which Pessimistic Peter responds “No, but there hasn’t been a tail for ages, so one must be due soon.”
The choice of ridiculous names is deliberate. If both people have some genuine preference for heads appearing, then the excitement of finding out what will happen next has caused them to forget the underlying mechanism. This is not a sequence of football matches, where confidence and a massive collection of psychological consequences from previous games can play a role, but a single coin toss. Such a coin toss is supposed to be fair not just once but every time!
One possibility is that the coin under discussion is in fact a joke coin, with heads on both sides. We can obviously exclude this by checking it first. However, it is worth remarking that if a friend told us that he’d tossed a coin six million times and it came up with heads every time, we should probably be suspicious about whether the coin was really fair after all!
For now though, we exclude that possibility, and assume we have verified that the coin is genuine. Now, if a physicist knew the exact weight and shape of the coin, which way up it was placed on your hand, exactly how fast your fingers were moving and where all the particles in the air nearby were sitting, they could in theory calculate which way up it would land. This would mean that the coin toss would not be random. In practice such a calculation is impossible, since it is unrealistic to assemble such a large amount of information, so we treat the outcome as random.
In general, the coin rotates in the air a large number of times, so it is roughly equally likely that it will spin an even number of times and land on heads as an odd number of times and so land on tails. But it is not possible to extend this physical description to more general random settings. For example, a computer can be programmed to simulate a coin toss, but it is not randomly spinning a physical object.
Assigning Probability To An Event
So if we are going to say that the probability of getting a head is 1/2 as we might expect, first we need some idea of what it means to assign some value as a probability. It is confusing that such values are often very subjective. For example, when it is announced that there is a 30% chance of rain tomorrow, this is based on some calculations done by the weather forecasters. But there is no objective way to tell whether they should instead have announced a 40% chance of showers, because we only get one opportunity to see whether or not it rained.
If someone says that there is a 2% chance of an earthquake, they are stating that they think such an event is very unlikely to occur. If an earthquake does then occur, it is tempting to look back and say that they shouldn’t have given such a low estimate. But of course, they might still have been right: even a very unlikely event can still occur. That is precisely the difference between ‘very unlikely’ and ‘impossible’. A failure to understand exactly this led to six Italian seismology experts being convicted of manslaughter and sentenced to prison after the 2009 earthquake in L’Aquila (the convictions were later overturned on appeal).
This isn’t a problem when it comes to coin-tossing. Here, we have the option to repeat the experiment as often as we want. If we toss the coin two million times, we would expect to have roughly one million heads and one million tails, that is, roughly half heads, half tails. This result is called the Law of Large Numbers. The answer suggested by Peter to the original problem makes reference to this. Intuitively, it might seem that if we have a run of heads, but know that we need equal proportions eventually, then we must need some extra tails soon. But the key word is ‘large’. The law only describes what happens over a very large number of tosses, perhaps a million, a billion, or more. On that scale, the first ten, or one hundred, tosses are a negligible fraction of the total, so the proportions can even out without any extra tails to compensate.
This gives us a plausible definition for probability:
The probability of an event is roughly the fraction of times we observe the event if we repeat the experiment a large number of times.
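As a rough illustration of this definition, here is a minimal Python sketch that simulates fair coin tosses with the standard library's pseudo-random generator and reports the fraction of heads for increasingly long experiments. The function name fraction_of_heads and the fixed seed are my own choices for illustration, and a pseudo-random generator is only a stand-in for a real coin.

```python
import random

def fraction_of_heads(num_tosses, seed=0):
    """Simulate num_tosses fair coin tosses and return the fraction that are heads."""
    rng = random.Random(seed)  # seeded so the run is repeatable (an illustrative choice)
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses

if __name__ == "__main__":
    # The fraction can be far from 1/2 for short runs and settles down for long ones.
    for n in (10, 1_000, 100_000):
        print(n, fraction_of_heads(n))
```

For a small number of tosses the fraction may be well away from 1/2, while for a large number it should sit close to 1/2, which is the sense in which the definition says 'roughly'.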
Independence Of Events
The key assumption we have to make is called independence. This says that the outcome of previous coin tosses has no influence on any future outcome. This answers our original question: the outcome of the eleventh toss is independent of, that is, unaffected by, the first ten coin tosses, so a head and a tail are equally likely.
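One way to see this in a simulated setting is to generate a long sequence of tosses, pick out every toss that follows a run of heads, and check how often that toss is also a head. The sketch below does this in Python; the function name heads_after_streak is hypothetical, and it uses shorter runs than ten so that plenty of runs turn up in a sequence of modest length. The simulated tosses are independent by construction, so this illustrates the assumption rather than proving anything about real coins.

```python
import random

def heads_after_streak(num_tosses, streak_length, seed=1):
    """Estimate the chance of heads on a toss that follows streak_length heads in a row.

    Returns None if no such streak appears in the simulated sequence.
    """
    rng = random.Random(seed)  # fixed seed for repeatability (illustrative choice)
    tosses = [rng.random() < 0.5 for _ in range(num_tosses)]  # True means heads

    streaks_seen = 0
    followed_by_heads = 0
    for i in range(streak_length, num_tosses):
        # Were the streak_length tosses just before toss i all heads?
        if all(tosses[i - streak_length:i]):
            streaks_seen += 1
            followed_by_heads += tosses[i]  # True counts as 1

    if streaks_seen == 0:
        return None
    return followed_by_heads / streaks_seen

if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k, heads_after_streak(200_000, k))
```

If Olivia were right, the estimates would drift above 1/2 as the run gets longer; if Peter were right, they would drift below. Under independence they should all stay close to 1/2.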
The main advantage of assuming independence is that it becomes easy to calculate the probability of compound events. For example, if we look at the four possible results of the first two coin tosses, these are Heads-Heads, Heads-Tails, Tails-Heads, Tails-Tails, and all are equally likely by assumption, so the probability of getting two tails is 1/4. In an example where you don’t have equally likely outcomes, suppose there is a 1/3 chance of rain on Monday and on Tuesday, and the weather on the two days is independent. Then we can work out the probability that it rains on both days by multiplying to get 1/9.
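The two calculations above can be checked with exact fractions. This is a small Python sketch using only the standard library; the variable names are my own, and the 1/3 chance of rain is simply the figure assumed in the example.

```python
from fractions import Fraction
from itertools import product

# The four results of two tosses: HH, HT, TH, TT, assumed equally likely.
outcomes = list(product("HT", repeat=2))
two_tails = sum(1 for outcome in outcomes if outcome == ("T", "T"))
p_two_tails = Fraction(two_tails, len(outcomes))

# Outcomes that are not equally likely: independence lets us multiply.
p_rain = Fraction(1, 3)             # assumed chance of rain on each day
p_rain_both_days = p_rain * p_rain  # Monday AND Tuesday

print(p_two_tails)       # 1/4
print(p_rain_both_days)  # 1/9
```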
So independence is an assumption that is often useful to make about sequences of observations. Two events are independent if knowing whether the first has occurred has no effect on the probability of the second. Formally, events A and B are independent exactly when:
Probability ( A AND B ) = Probability ( A ) x Probability ( B ).
To put this in a betting context, suppose you have a double on Manchester United to defeat Chelsea and Liverpool to defeat Arsenal, that is, a single bet that pays out only if both teams win. Manchester United are at odds of 2.00 to defeat Chelsea and Liverpool are at odds of 4.00 to defeat Arsenal.
Manchester United have been assessed as a 50% chance of winning (implied probability of 2.00 odds) by the bookmakers and Liverpool have been assessed as a 25% chance of beating Arsenal (implied probability of 4.00 odds).
So the probability of both teams winning, assuming the two results are independent events, is calculated thus:
0.5 x 0.25 = 0.125 = 12.5% probability = Odds of 8.00
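The same arithmetic can be written as a short Python sketch. It assumes, as the example above does, that the implied probability is simply 1 divided by the decimal odds and that the results are independent; the function names implied_probability and double_odds are hypothetical.

```python
def implied_probability(decimal_odds):
    """Convert decimal odds to the probability they imply (1 / odds)."""
    return 1 / decimal_odds

def double_odds(*decimal_odds):
    """Decimal odds of all selections winning, assuming the results are independent."""
    probability = 1.0
    for odds in decimal_odds:
        probability *= implied_probability(odds)  # multiply the probabilities
    return 1 / probability

if __name__ == "__main__":
    print(implied_probability(2.00))                               # 0.5
    print(implied_probability(4.00))                               # 0.25
    print(implied_probability(2.00) * implied_probability(4.00))   # 0.125
    print(double_odds(2.00, 4.00))                                 # 8.0
```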
So we have seen that we need to think carefully about what it means to say that an event happens with some probability, and have introduced the notion of independence as a standard assumption for doing calculations with abstract examples.
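With this assumption, we can answer the original question: the eleventh toss is equally likely to be a head or a tail. The mistaken belief that a tail is ‘due’ after a run of heads is known as the Gambler’s Fallacy.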
In the next post, we will discuss when independence is a good assumption for real-world examples, and discover some of the surprising effects that can be seen when it is applied wrongly.
Follow Dominic on Twitter: @DominicJYeo
Read more of Dominic's work on his blog EventuallyAlmostEverywhere.wordpress