A recent posting by Alex Bellos has made me puzzle, once again, about probability.
I spent about five years teaching probability to first-year undergraduates. This had one effect on my thinking: it converted me into a covert Bayesian. I would say that all probability is conditional probability, conditional on everything that I know at the moment. If I say that the probability that a fair coin comes down heads is 1/2, this is because I have no further information either way; if I know something about the person tossing the coin, the situation of the coin toss, or anything else, I may revise my view.
But I have no satisfactory answer to the question “What is probability?”, and my recurring nightmare at that time was that a student would ask me that question and I would be unable to answer it.
The Monty Hall problem is back in the news at present, partly because an entire book about it has just appeared: The Monty Hall Problem: The Remarkable Story of Math’s Most Contentious Brain Teaser by Jason Rosenhouse. (I don’t agree with the subtitle, but let that pass.) I never had any difficulty with the Monty Hall problem: the right answer seems obvious to me.
To recall: Monty Hall, a game show host, shows you three doors. Behind one door is a car, behind the other two are goats. Assume that you want to get the car and have no interest in goats. Monty invites you to choose a door. Then he opens another door, and shows you a goat; he asks you if you want to stick with your original choice or switch to the other door. Clearly, if you switch, you double your chances of winning the car. (Why is it clear? The chance that the car is behind the door you first choose is 1/3, and because Monty acts in such a way that you gain no information about the door you chose, this is unchanged; and after he opens a door, you know that the car is behind either the door you chose, or the remaining door.)
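For readers who prefer to see the bookkeeping, the argument can be checked by listing every case. What follows is only a minimal sketch in Python: the function name and the door labels 0, 1, 2 are my own, and I assume that the car's position and your first choice are uniform and independent, and that Monty picks uniformly when two goat doors are available to him.

```python
from fractions import Fraction
from itertools import product


def win_probabilities():
    """Return the exact chances of winning by sticking and by switching."""
    doors = (0, 1, 2)
    stick = Fraction(0)
    switch = Fraction(0)
    # Car position and first pick: assumed uniform and independent (9 cases).
    for car, pick in product(doors, doors):
        weight = Fraction(1, 9)
        # Monty only ever opens a goat door that is not the chosen door.
        options = [d for d in doors if d != pick and d != car]
        for opened in options:
            # Assumption: Monty is unbiased when he has two doors to choose from.
            w = weight / len(options)
            # The door you would move to if you switch.
            other = next(d for d in doors if d != pick and d != opened)
            if pick == car:
                stick += w
            if other == car:
                switch += w
    return stick, switch
```

Calling `win_probabilities()` gives 1/3 for sticking and 2/3 for switching, in agreement with the argument above.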
Purists will (and do) object to my statement of the problem. I should specify the algorithm that Monty uses to choose a door to open. I don’t think so. Monty is the host; he knows where the car is, and has no intention of showing you a car, or of giving you any information about where the car is beyond the fact that it is not behind the door he opened. Moreover, you have never been on (or even watched) the show before. It is always possible for Monty to open a door and show you a goat, and there is no doubt that he will do that. The condition about your ignorance is inserted because, if Monty had even a small bias towards one particular door (say the leftmost one) when he has a choice, then your knowledge of this would affect your calculation of the probabilities.
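To see how a bias would matter if you knew about it, here is a small Bayes calculation, offered as a sketch rather than as part of the problem. I assume that you chose door 0, and I introduce a hypothetical parameter `q` for the chance that Monty opens door 1 when doors 1 and 2 both hide goats; neither the labels nor `q` come from the problem as stated.

```python
from fractions import Fraction


def stick_posterior(q, opened):
    """P(car behind door 0 | you picked door 0 and Monty opened `opened`).

    q is the hypothetical chance that Monty opens door 1 when he is free
    to open either door 1 or door 2; `opened` must be 1 or 2.
    """
    prior = Fraction(1, 3)  # car equally likely behind each door
    # Likelihood of seeing `opened`, for each possible car position.
    likelihood = {
        0: q if opened == 1 else 1 - q,                    # Monty has a free choice
        1: Fraction(0) if opened == 1 else Fraction(1),    # he must open door 2
        2: Fraction(1) if opened == 1 else Fraction(0),    # he must open door 1
    }
    total = sum(prior * likelihood[car] for car in likelihood)
    return prior * likelihood[0] / total
```

With `q = Fraction(1, 2)` the function returns 1/3 whichever door is opened, which is the usual answer. With `q = Fraction(1)` (Monty always opens door 1 when he can) it returns 1/2 if he opens door 1 and 0 if he opens door 2, so a contestant who knew of the bias would calculate differently.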
But this example described by Alex has me flummoxed. First, the background. If somebody says to you, “I have two children, and (at least) one is a boy,” what is the probability that both children are boys? Assuming that boys and girls are equally likely (not quite true, but let’s ignore that), the four possible combinations in birth order (BB, BG, GB and GG) are equally likely; the given information rules out GG, and so the formula for conditional probability gives the answer 1/3. All that is fine. Now suppose you are told instead, “I have two children, and at least one of them is a boy born on Tuesday.” Assume that the seven days of the week are equally likely for births (again not true, but let’s ignore that), and that gender and birth day are independent (I have no idea about the truth of this, but suspect it is false too). Then the same calculation of conditional probabilities shows that the probability that both children are boys is 13/27: of the 196 equally likely combinations of sex and birth day for the two children, 27 include a boy born on Tuesday, and 13 of those have two boys.
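Both numbers can be confirmed by brute-force counting. The following is a minimal sketch under the assumptions just stated (sexes equally likely, days equally likely, everything independent); the function names are mine, and the choice of day 1 to stand for Tuesday is an arbitrary labelling.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # arbitrary label: days are numbered 0..6


def prob_two_boys(condition):
    """P(both children are boys | condition), by counting equally likely cases."""
    children = list(product("BG", range(7)))        # 14 (sex, day) possibilities
    families = list(product(children, repeat=2))    # 196 ordered pairs (elder, younger)
    kept = [f for f in families if condition(f)]
    both = [f for f in kept if all(sex == "B" for sex, _ in f)]
    return Fraction(len(both), len(kept))


def at_least_one_boy(family):
    return any(sex == "B" for sex, _ in family)


def at_least_one_tuesday_boy(family):
    return any(child == ("B", TUESDAY) for child in family)
```

Here `prob_two_boys(at_least_one_boy)` is 1/3 (49 cases out of 147), while `prob_two_boys(at_least_one_tuesday_boy)` is 13/27 (13 cases out of 27).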
But why is it not 1/3? After all, the same calculation would apply no matter what birth day was given; and the information seems to be irrelevant to the question posed.
I don’t have any way of making this answer seem obvious, or even plausible. Can anybody suggest one?
My only suggestion is that, as I made clear, the Monty Hall problem refers to a one-off event, and these are the hardest to think about probabilistically. Any bias that Monty has in choosing a door only comes into play in a long sequence of plays of the game.
Well-known examples in conditional probability give estimates of the probability that someone has a rare disease given that they have just tested positive for it (remember that no test is 100% reliable). No problem with that; but if I have just taken the test, the reasoning seems less satisfactory. For a start, I have more information about myself than about a random member of the population; I know, for example, that in the last year I have had several unexplained headaches, or …
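For the population-level version of the calculation, Bayes' theorem does all the work. The sketch below uses numbers invented purely for illustration (a prevalence of 1 in 1000, a test that detects 99% of true cases, and a 5% false-positive rate); they are assumptions of mine and do not describe any real test.

```python
from fractions import Fraction


def prob_disease_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' theorem: P(disease | positive test) for a random member of the population."""
    true_pos = prevalence * sensitivity                  # has the disease and tests positive
    false_pos = (1 - prevalence) * false_positive_rate   # healthy but tests positive
    return true_pos / (true_pos + false_pos)


# Hypothetical figures, chosen only to illustrate the effect.
example = prob_disease_given_positive(
    prevalence=Fraction(1, 1000),
    sensitivity=Fraction(99, 100),
    false_positive_rate=Fraction(5, 100),
)
```

With these made-up figures, `example` is 11/566, a little under 2%: most positive results come from the much larger healthy population. The calculation says nothing about any extra information I have about myself, which is exactly the difficulty.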
In the “boy born on Tuesday” problem, it seems much more obvious that a test of many cases would agree approximately with the answer 13/27, than that the answer is right in the unique case that confronts us. The test would run as follows. A large number of parents would be chosen. Any who could not truthfully make the statement would be rejected. Of those who could, we would certainly expect that about 13/27 would indeed have two boys. However, in an individual case, we are tempted to think that the information about the birth day was thrown in gratuitously and should have no effect on the answer.
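Such a test is easy to imitate on a computer. This is a rough simulation sketch, not a proof: the function name, the number of families and the seed are arbitrary choices of mine, and it assumes two-child families with sexes and birth days uniform and independent (day 1 standing for Tuesday).

```python
import random


def simulate_tuesday_boy(n_families=200_000, seed=0):
    """Estimate P(two boys) among parents who can truthfully make the statement."""
    rng = random.Random(seed)
    kept = 0   # parents with at least one boy born on "Tuesday" (day 1)
    both = 0   # those among them who have two boys
    for _ in range(n_families):
        kids = [(rng.choice("BG"), rng.randrange(7)) for _ in range(2)]
        if any(kid == ("B", 1) for kid in kids):
            kept += 1
            if all(sex == "B" for sex, _ in kids):
                both += 1
    return both / kept
```

The estimate should land close to 13/27 ≈ 0.481, and visibly away from 1/3, though the exact figure depends on the seed and the sample size.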
For example, suppose that the protocol was as follows. A large set of parents is chosen; only those who have two children, at least one of whom is a boy, are retained. One such parent is chosen at random and instructed to make the following statement “I have two children, and at least one is a boy born on X day”, where, if they have just one boy, then X day is the boy’s birth day, while if they have two, they are to choose one by any means that does not depend on the birth days. For this formulation, the answer is 1/3.
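This protocol can also be checked by exact counting. The sketch below makes one assumption beyond the text: a parent with two boys picks which boy to mention uniformly at random, so that the choice does not depend on the birth days. The function name and the day labels 0..6 are mine.

```python
from fractions import Fraction
from itertools import product


def prob_two_boys_given_named_day(named_day=1):
    """P(two boys | the parent's statement names `named_day`) under the protocol."""
    children = list(product("BG", range(7)))   # 14 equally likely (sex, day) pairs
    stated = Fraction(0)            # weight of families whose parent names this day
    stated_and_both = Fraction(0)   # ... and who have two boys
    for family in product(children, repeat=2):
        boy_days = [day for sex, day in family if sex == "B"]
        if not boy_days:
            continue  # no boy: this parent is not retained
        # Assumption: with two boys, each is equally likely to be the one mentioned.
        p_names_day = Fraction(boy_days.count(named_day), len(boy_days))
        stated += p_names_day
        if len(boy_days) == 2:
            stated_and_both += p_names_day
    return stated_and_both / stated
```

The result is 1/3 whichever day is named, because under this protocol the day carries no information about how many boys there are. A parent who instead preferred to mention a Tuesday-born boy whenever possible would bring us back to 13/27.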
In his 1905 paper which won him the Nobel prize and was one of the foundation documents of quantum theory, Einstein said,
In calculating entropy by molecular-theoretic methods, the word “probability” is often used in a sense differing from the way the word is defined in probability theory. In particular, “cases of equal probability” are often hypothetically stipulated when the theoretical methods employed are definite enough to permit a deduction rather than a stipulation.
In other words, “Don’t assume that all outcomes are equally likely, especially if you are given enough information to calculate what their probabilities really are”. But how does that principle apply in this case?