A recent posting by Alex Bellos has made me puzzle, once again, about probability.

I spent about five years teaching probability to first-year undergraduates. This had one effect on my thinking: it converted me into a covert Bayesian. I would say that all probability is conditional probability, conditional on everything that I know at the moment. If I say that the probability that a fair coin comes down heads is 1/2, this is because I have no further information either way; if I know something about the person tossing the coin, the situation of the coin toss, or anything else, I may revise my view.

But I have no satisfactory answer to the question “What is probability?”, and my recurring nightmare at that time was that a student would ask me that question and I would be unable to answer it.

The Monty Hall problem is back in the news at present, partly because an entire book about it has just appeared: The Monty Hall Problem: The Remarkable Story of Math’s Most Contentious Brain Teaser by Jason Rosenhouse. (I don’t agree with the subtitle, but let that pass.) I never had any difficulty with the Monty Hall problem: the right answer seems obvious to me.

To recall: Monty Hall, a game show host, shows you three doors. Behind one door is a car, behind the other two are goats. Assume that you want to get the car and have no interest in goats. Monty invites you to choose a door. Then he opens another door, and shows you a goat; he asks you if you want to stick with your original choice or switch to the other door. Clearly, if you switch, you double your chances of winning the car. (Why is it clear? The chance that the car is behind the door you first choose is 1/3, and because Monty acts in such a way that you gain no information about the door you chose, this is unchanged; and after he opens a door, you know that the car is behind either the door you chose, or the remaining door.)

Purists will (and do) object to my statement of the problem. I should specify the algorithm that Monty uses to choose a door to open. I don’t think so. Monty is the host; he knows where the car is, and has no intention of showing you a car, or of giving you any information about where the car is beyond the fact that it is not behind the door he opened. Moreover, you have never been on (or even watched) the show before. It is always possible for Monty to open a door and show you a goat, and there is no doubt that he will do that. The condition about your ignorance is inserted because, if Monty had even a small bias towards one particular door (say the leftmost one) when he has a choice, then your knowledge of this would affect your calculation of the probabilities.

But this example described by Alex has me flummoxed. First, the background. If somebody says to you, “I have two children, and (at least) one is a boy,” what is the probability that both children are boys? Assuming that boys and girls are equally likely (not quite true, but let’s ignore that), the four possible combinations in birth order (BB, BG, GB and GG) are equally likely; the given information rules out GG, and so the formula for conditional probability gives the answer 1/3. All that is fine. Now suppose you are told instead, “I have two children, and at least one of them is a boy born on Tuesday.” Assume that the seven days of the week are equally likely for births (again not true, but let’s ignore that), and that gender and birth day are independent (I have no idea about the truth of this, but suspect it is false too). Then there are 14 × 14 = 196 equally likely combinations of sex and birth day for the two children; 27 of them contain a boy born on Tuesday, and in 13 of those both children are boys. So the same calculation of conditional probabilities shows that the probability that both children are boys is 13/27.
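
Both numbers can be checked by brute force. The following is a minimal sketch, assuming exactly the idealisations stated above (sexes and days uniform and independent); the function names and the numbering of the days are illustrative choices, not anything from Alex's article.

```python
from fractions import Fraction
from itertools import product

SEXES = "BG"
DAYS = range(7)      # 0 = Sunday, ..., 6 = Saturday; the labelling is arbitrary
TUESDAY = 2

# A family is a pair of (sex, day) children, elder first: 14 * 14 = 196 cases.
CHILDREN = list(product(SEXES, DAYS))
FAMILIES = list(product(CHILDREN, repeat=2))


def prob_two_boys(condition):
    """P(both boys | condition) under the uniform measure on the 196 families."""
    kept = [f for f in FAMILIES if condition(f)]
    both = [f for f in kept if all(sex == "B" for sex, _ in f)]
    return Fraction(len(both), len(kept))


def at_least_one_boy(family):
    return any(sex == "B" for sex, _ in family)


def at_least_one_tuesday_boy(family):
    return any(child == ("B", TUESDAY) for child in family)


if __name__ == "__main__":
    print(prob_two_boys(at_least_one_boy))          # 1/3
    print(prob_two_boys(at_least_one_tuesday_boy))  # 13/27
```

Replacing TUESDAY by any other day gives the same 13/27, which is exactly what makes the answer feel so odd.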

But why is it not 1/3? After all, the same calculation would apply no matter what birth day was given; and the information seems to be irrelevant to the question posed.

I don’t have any way of making this answer seem obvious, or even plausible. Can anybody suggest one?

My only suggestion is that, as I made clear, the Monty Hall problem refers to a one-off event, and these are the hardest to think about probabilistically. Any bias that Monty has in choosing a door only comes into play in a long sequence of plays of the game.

Well-known examples in conditional probability give estimates of the probability that someone has a rare disease given that they have just tested positive for it (remember that no test is 100% reliable). No problem with that; but if I have just taken the test, the reasoning seems less satisfactory. For a start, I have more information about myself than about a random member of the population; I know, for example, that in the last year I have had several unexplained headaches, or …

In the “boy born on Tuesday” problem, it seems much more obvious that a test of many cases would agree approximately with the answer 13/27, than that the answer is right in the unique case that confronts us. The test would run as follows. A large number of parents would be chosen. Any who could not truthfully make the statement would be rejected. Of those who could, we would certainly expect that about 13/27 would indeed have two boys. However, in an individual case, we are tempted to think that the information about birthday was thrown in gratuitously and should have no effect on the answer.

For example, suppose that the protocol was as follows. A large set of parents is chosen; only those who have two children, at least one of which is a boy, are retained. One such parent is chosen at random and instructed to make the following statement “I have two children, and at least one is a boy born on X day”, where, if they have just one boy, then X day is the boy’s birth day, while if they have two, they are to choose one by any means at all. For this formulation, the answer is 1/3.
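
Here is a rough check of that claim, as a sketch only. It assumes that a parent with two boys picks which one to mention by a fair coin toss, independently of the birth days; that tie-break is my assumption for the purposes of the calculation, and a choice that depended on the days could give a different number.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)
TUESDAY = 2


def prob_two_boys_given_named_day(named_day=TUESDAY):
    """P(both boys | the parent says 'at least one is a boy born on <named_day>')."""
    said = Fraction(0)           # weight of all families that make the statement
    said_and_bb = Fraction(0)    # the part of that weight coming from two-boy families
    children = list(product("BG", DAYS))
    for first, second in product(children, repeat=2):
        boy_days = [day for sex, day in (first, second) if sex == "B"]
        if not boy_days:
            continue             # parents without a boy are not retained
        weight = Fraction(1, 196)
        # Assumed tie-break: with two boys, each is picked with probability 1/2.
        p_named = Fraction(boy_days.count(named_day), len(boy_days))
        said += weight * p_named
        if len(boy_days) == 2:
            said_and_bb += weight * p_named
    return said_and_bb / said


if __name__ == "__main__":
    print(prob_two_boys_given_named_day())   # 1/3
```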

In his 1905 paper which won him the Nobel prize and was one of the foundation documents of quantum theory, Einstein said,

In calculating entropy by molecular-theoretic methods, the word “probability” is often used in a sense differing from the way the word is defined in probability theory. In particular, “cases of equal probability” are often hypothetically stipulated when the theoretical methods employed are definite enough to permit a deduction rather than a stipulation.

In other words, “Don’t assume that all outcomes are equally likely, especially if you are given enough information to calculate what their probabilities really are”. But how does that principle apply in this case?

About Peter Cameron

I count all the things that need to be counted.

30 Responses to Probability

  1. Pingback: Tweets that mention Probability « Peter Cameron's Blog --

  2. Olof Sisask says:

    About the two-children B/G problem: I think it seems somewhat intuitive that the probability that you have two boys increases from 1/3, for the following reason. If you have a boy and a girl, then it’s quite unlikely that the boy was born on a Tuesday; it is much more likely to happen if both the children are boys. So if you know you’re in a setup where you have a boy that was born on a Tuesday then it’s quite unlikely that you have a boy and a girl. In other words, you need to have two boys for it to be reasonable that the unlikely event that one was born on a Tuesday occurs.

  3. Olof Sisask says:

    (Obviously I’m using the words ‘unlikely’ etc. in an informal way above, for illustration; 13/27 is still less than a half!)

  4. John Faben says:

    I think most of the explanation for this comes in the penultimate paragraph, where you discuss the process the person used to produce their statement.

    I think that the reason 13/27 doesn’t seem obvious is because it isn’t really very interesting – without having some information about the algorithm used to generate the statement, we don’t know what effect it should have on our conditional probability, and the algorithm “if I can truthfully make the statement ‘I have two children, at least one of whom is a boy born on a Tuesday’, do so, otherwise stay quiet” just isn’t a very good model of how people decide what to say.

  5. Mathematically the conditional probability that someone has two boys, given that one of their two children is a boy born on Tuesday, is 13/27. If we started defining the probability of some event in terms of the algorithm that led to the statement being made, all our textbooks would need to be rewritten from the ground up. I think the best way to proceed is to say, we do know how to calculate probability (and how to interpret it), but this requires careful thought, and sometimes our intuition lets us down.

  6. Pingback: Randomness in Nature « Combinatorics and more

  7. Ted Jones says:

    Thanks for making me aware of the wonderful “Tuesday” example. I’m not sure I understand what you mean in your remark about rewriting of textbooks, but to me considering the process by which the parent’s statement came to be is essential for understanding.

    I would formulate it this way: we have four independent random variables s1,s2,d1,d2, the sexes and days of birth of the two children. In addition we have the parent’s statement, which in the example is that X=(s1, s2, d1, d2) belongs to a set (at least one Tuesday boy) that I’ll call E0. Now, assume the parent’s statement is required to be true; if not, the problem disappears. There is an event {not E0} of positive probability on which the parent would have had to make a different statement, or none at all. I’ll assume “none at all” can’t occur. Allowing it is tantamount to rejecting the assumption that the sexes and days are equally likely, because–conditional on the parent having made a statement–they may not be.

    We can formalize the parent’s statement as a random variable A whose value is a subset of the 196 possible sex/day 4-tuples. There is no reason why the parent’s choice has to be deterministic, but following John Faben’s lead, let’s say it is, so A is a function of X. The loss of generality is not important for my point. So we have A=f(X), with the “truth” property that f(X) must contain X.

    The event we observed is not, in fact, E0, but instead {X:f(X)=E0}. This is equal to E0 if and only if f(X)=E0 for all X in E0, i.e., iff the parent makes the given statement E0 whenever it is possible to do so. In that case, P(BB|A=E0) = P(BB|E0) = 13/27 is correct. But with a different function f, the event conditioned on is different. For a parent who (given at least one boy), mentions “boy” plus the earliest day of the week (ordered Sun..Sat) on which a boy was born, I get 9/23 (hope that’s right). For a parent who avoids mentioning Tuesday boys if any other statement is possible, the probability is 1.

    The day of the week does seem irrelevant at first glance, but whether it actually is irrelevant depends on the parent. A (nondeterministic) parent-rule I find natural is that the parent picks a child at random, and says there is at least one of whatever category the child falls into. The conditional probability of BB is 1/2 whether or not the day is included.

    I believe the paradox results from computing the conditional probability assuming a version of f(X) different from the one implicit in our intuitive judgment that the day of the week is irrelevant. That’s a psychological rather than a mathematical conclusion, of course. What do you think of it?
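
The figures quoted in this comment can be checked by enumeration. This is a sketch under the comment's own assumptions (196 equally likely cases, a truthful parent); each rule is encoded as the probability that the parent says "at least one boy born on Tuesday" for a given family, and the encodings, in particular the one for the Tuesday-avoiding parent, are my reading of the descriptions rather than Ted's own code.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)              # 0 = Sunday ... 6 = Saturday, matching the Sun..Sat ordering
TUESDAY = 2
TARGET = ("B", TUESDAY)      # the statement "at least one boy born on Tuesday"
FAMILIES = list(product(product("BG", DAYS), repeat=2))


def always_mention_tuesday_boy(family):
    """Says TARGET whenever it is true."""
    return Fraction(int(TARGET in family))


def earliest_boy_day(family):
    """Says 'boy' plus the earliest weekday on which a boy was born."""
    boy_days = [day for sex, day in family if sex == "B"]
    return Fraction(int(bool(boy_days) and min(boy_days) == TUESDAY))


def avoid_tuesday_boy(family):
    """Says TARGET only when forced to, i.e. both children are Tuesday boys."""
    return Fraction(int(family == (TARGET, TARGET)))


def random_child(family):
    """Picks a child with probability 1/2 each and names that child's category."""
    return Fraction(family.count(TARGET), 2)


def prob_two_boys(rule):
    """P(BB | parent says TARGET) when the parent follows `rule`."""
    said = sum(rule(f) for f in FAMILIES)
    said_bb = sum(rule(f) for f in FAMILIES if all(sex == "B" for sex, _ in f))
    return said_bb / said


if __name__ == "__main__":
    for rule in (always_mention_tuesday_boy, earliest_boy_day,
                 avoid_tuesday_boy, random_child):
        print(rule.__name__, prob_two_boys(rule))
```

Under these encodings the four rules give 13/27, 9/23, 1 and 1/2 respectively, so the 9/23 does appear to be right.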

  8. Bob Walters says:

    I agree with John Faben that the calculation of probability requires information about the algorithm involved.

  9. Sorry to have to disagree…

    Any calculation in probability can be done unambiguously by the rules (based on Kolmogorov’s axioms) provided we specify carefully what is the “sample space” and what is the probability measure on the sample space. If you start bringing in other factors like the algorithm used to generate a statement then all you are doing is changing the measure. That is why I said in my example that I am a covert Bayesian. I happen to think that the probability measure that applies in a given situation depends on everything I know about the situation (which may include information about the algorithm used to generate some statement).

    In the Tuesday boy problem, after we know that the parent has two children, we have a sample space made up of 196 equally-likely combinations. (Arguably the correct sample space is much more fine-grained than this, but leave this aside for the moment). The statement reduces us to a set of 27, and the induced probability measure makes them all equally likely; in 13 of these cases it is true that both children are boys.

    If I knew that a different algorithm was being used, that would (by an application of Bayes’ theorem) give me a different measure, and the calculation would give a different answer. But if you want a different answer to this question, you had better make this algorithm public, so that I can use this knowledge to adjust the probabilities.

    I am reminded of one of the questions that reached me after my appearance on the Horizon programme on infinity. Someone said, if I stayed a night in the infinite hotel, all rooms of which were full, and the next morning the other guests left, then this would demonstrate that infinity minus infinity is one. This is wrong because the history of the construction of an infinite set shouldn’t affect calculations of its cardinality. I think a similar principle applies here.

    Finally, I am not violating Einstein’s requirement. The 196 combinations are all equally likely (at least to the sort of approximation we are using here), and if a statement is made to me I use this to recalculate the probabilities using Bayes’ theorem. I do not assume a priori that the 27 combinations remaining after the statement is made are equally likely.

  10. John Faben says:

    Ok, so I’m 100% in agreement that the conditional probability of a person having 2 boys given that they have two children one of whom is a boy born on a Tuesday is 13/27. (similarly, if we are told that their eldest child is a boy, the probability goes to 50% – a situation where the intuition is perhaps easier to see?).

    However, I’m not entirely sure that I agree that this probability is the same as the probability that someone has 2 boys given that they *say* “I have two children, one of whom is a boy born on a Tuesday”. In fact, it certainly isn’t – real people just don’t decide what to say by generating statements at random and saying them if they’re true.

    I’m maybe just arguing semantics, but I think there’s a genuine point here. It helps to explain why the “born on a Tuesday” clause feels as though it shouldn’t give any information – in a real conversation, it actually wouldn’t.

  11. Fair point.

    What you are saying is that from the information that someone makes this statement, you can infer nothing. The person may be lying, or might be telling the truth in such a way as to mislead, or simply confused. As Montaigne said, “If, like the truth, falsehood had only one face, we should know better where we are, for we should then take the opposite of what a liar said to be the truth. But the opposite of the truth has a hundred thousand shapes and a limitless field.” Or Anthony Kenny, “All worthwhile philosophical statements express an insight; and the opposite of an insight is not a contradictory sentence, but a muddle.” I don’t think probability theory can handle this complication!

  12. Ted Jones says:

    My last comment was held in moderation for a while, and I don’t know if John has seen it yet.

    Peter, I’m with John on this, and I don’t think you’re quite following our argument. As I think I showed in my comment, it’s perfectly possible to discuss the distinction John makes using the sample space with 196 equiprobable points, and the mathematical definition of conditional probability with no extraneous factors; the “algorithm” is represented by a random variable defined on the 196-point sample space. And allowing the parent’s statement to be untrue is not necessary to get a conditional probability different from 13/27.

    At least if you accept my version of things, 13/27 is the correct answer only if the parent mentions a boy born on Tuesday whenever it is possible to do so (truthfully). However, you can’t exclude the cases where the parent has to make a different statement; they account for more than half the 196 points. In those cases, the statement must be different.

    So, what will the parent say when there is no Tuesday boy? Any truthful statement will do, but I haven’t been able to come up with an algorithm (or, more precisely, a random variable mapping the sample space into its power set) that both (1) is constrained as needed to get 13/27 and (2) corresponds to behavior that I find at all natural in the cases where there is no boy born on Tuesday.

    For example, if there are two girls both born on Tuesday, it seems to me that the parent should say there is at least one girl born on a Tuesday. Saying instead something like “I have one child born on either a Tue, Thu, or Sat, and one child–not necessarily the same–who is a girl” is true, and therefore possible, but it seems perverse to assume the problem contemplates a parent who might actually say that.

    So, ok, for two girls born on Tuesday the answer should be “at least one girl born on a Tuesday”. But then what does the parent say when there are a boy and a girl, both born on Tuesday? Any choice seems to violate some sort of symmetry.

    It’s true that to calculate the conditional probability you don’t need to know more about the parent’s statement than the event on which the parent will say “at least one boy born on Tuesday”. And, if you choose to do so, you can make the assumption needed to get 13/27 as the answer. It’s just that if I think about it, that assumption leads to some strange conclusions about the parent that I don’t think should be taken for granted as necessarily following from the statement of the problem.

    With my proposed parent-rule of choosing a child at random and saying there is at least one of that category (which does require expanding the sample space to 2*196 points, but I don’t see that as mattering for understanding), the day of the week does become uninformative as intuition suggests, and it doesn’t lead to any implausible or asymmetric choices for the parent.

    I think my child-at-random rule is much more faithful to what a reader would intuitively expect the parent to do. The other assumption becomes “natural” only when you sit down to compute the probability, and then only if you don’t think about the “outside” cases.

    If you can persuade me that there is a way for a parent to give answers leading to 13/27 without “strange” behavior in other cases, please do. It would make this an even better problem.

  13. Ted Jones says:

    Or, here’s another way to look at it. Same setup, but the parent tells you there is at least one girl born on Tuesday, and the question is what is the probability that there are two girls? If you assume the behavior that is required to get 13/27 in the boy case, then in the case where there are a boy and a girl, both born on Tuesday, the parent has to mention the boy. The result is that the answer to the “girl” version of the question cannot be 13/27.

  14. Sorry about the delay – it is exam time, and service will be a bit slower for a while…

    I think I stick to my position (though I am not quite sure). Allowing a statement to be a random variable is, I think, covered by my saying that “The person may be lying, or might be telling the truth in such a way as to mislead, or simply confused.” I don’t think I know any probability textbook that defines conditional probability P(A|B) when B is a random event chosen from some probability distribution possibly different from the given one. I think that by saving our intuition on this problem you run the risk of horribly confusing students struggling with conditional probability (unfortunately). But I entirely agree with expanding the notion in this way in a second course on probability and showing the students how various paradoxes can be avoided.

    Trouble is, it makes actual computations of conditional probability impossible without a great deal of extra information.

  15. Bob Walters says:

    Referring to Peter Cameron’s comment of 11:50, 10 May 2010:
    There is a sleight of hand in the statement “If you start bringing in other factors like the algorithm used to generate a statement then all you are doing is changing the measure”. The algorithm may be needed to *determine* the measure space; different algorithms determine different measure spaces.

  16. Let me try another example, a standard one in elementary probability. There is a screening test for a rare disease (prevalence 1 in 50 of the population). The test has a 5% probability of false negatives and 10% of false positives. I have just taken the test, and the result was positive. What is the probability that I have the disease? A routine calculation using the Theorem of Total Probability and Bayes’ Theorem gives the result of 16.2% (which still comes as a surprise to many people). However, should we take the following into consideration:

    • Maybe the test was part of a screening of the whole population; or maybe I took it because I had been sent by my GP, or because I was worried about certain symptoms.
    • In the third case, the symptoms may be indicative of the disease (more or less), or completely irrelevant.
    • I am a hypochondriac, so I am going to assume the worst.

    Given this, it seems difficult or impossible to put a number on the probability. The first point has something to do with the algorithm or protocol used, but it is difficult to say that the others are!
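
For reference, the routine calculation can be written out as follows. This is a sketch that reads "5% false negatives" as P(negative | disease) and "10% false positives" as P(positive | no disease); that reading, and the function name, are assumptions. It treats me as a random member of the population, which is precisely the step the bullet points call into question.

```python
from fractions import Fraction


def posterior_given_positive(prevalence, false_negative_rate, false_positive_rate):
    """P(disease | positive test) by the Theorem of Total Probability and Bayes' Theorem."""
    p_pos_given_disease = 1 - false_negative_rate
    p_positive = (prevalence * p_pos_given_disease
                  + (1 - prevalence) * false_positive_rate)
    return prevalence * p_pos_given_disease / p_positive


posterior = posterior_given_positive(Fraction(1, 50), Fraction(5, 100), Fraction(10, 100))

if __name__ == "__main__":
    print(posterior, float(posterior))   # 19/117, a little over 0.16
```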

  17. Ted Jones says:

    Peter, your point of view is still mysterious to me, but I’d like to change the subject back to the Monty Hall problem. Before I do I’d like to say I agree with you that this whole area of discussion is something with a lot of potential to confuse students, and I wouldn’t bring it up in an introductory course. It’s really only natural to think this way when trying to resolve paradoxes. I’d avoid them too, with beginners.

    About Monty, you say “clearly” you should switch. I agree that you should, of course, but the history of the problem is that a large number of readers wrote in to Parade Magazine to “correct” that conclusion after it was presented there. Many of them claimed to have PhDs–some in mathematics! In fact, my own initial knee-jerk reaction before giving it any thought was that it probably didn’t matter if you switch or not.

    Why was that? Here’s my explanation. Consider a variant of the problem that I’ll call “ignorant Monty”. Ignorant Monty has no idea where the car is. He picks one of the two doors not chosen by the player at random, with probability 1/2 for each, and has it opened. He might reveal the car; then the player loses immediately. But if there are goats behind ignorant Monty’s door, the game continues. Does the player still benefit from switching?

    For ignorant Monty, the answer given by the irate Parade readers is correct: it doesn’t matter whether the player switches or not. I encourage anyone who doubts this to perform a computer simulation and see how it works out (I did, out of respect for this problem’s history).
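
For anyone who wants to try it, here is one way such a simulation might look. It is a sketch, not the program referred to above: the function names, the trial count and the seed are arbitrary, and games in which ignorant Monty reveals the car are simply discarded, so the reported rates are conditional on a goat having been shown.

```python
import random


def play(rng, monty_knows, switch):
    """Play one game. Returns True/False for a win/loss, or None if the car is revealed."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    others = [d for d in doors if d != pick]
    if monty_knows:
        # Regular Monty never shows the car; he tosses a coin when he has a choice.
        opened = rng.choice([d for d in others if d != car])
    else:
        # Ignorant Monty opens either remaining door with probability 1/2.
        opened = rng.choice(others)
        if opened == car:
            return None
    if switch:
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == car


def win_rate(monty_knows, switch, trials=100_000, seed=0):
    """Fraction of wins among the games in which Monty showed a goat."""
    rng = random.Random(seed)
    results = [play(rng, monty_knows, switch) for _ in range(trials)]
    continued = [r for r in results if r is not None]
    return sum(continued) / len(continued)


if __name__ == "__main__":
    for knows in (True, False):
        for switch in (True, False):
            print(f"knows={knows} switch={switch}: {win_rate(knows, switch):.3f}")
```

With regular Monty the switching rate should come out near 2/3 and the sticking rate near 1/3; with ignorant Monty both should hover around 1/2.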

    My theory about the reader outrage is that the ignorant Monty represents a kind of situation that scientists are likely to think about much more often than the average person. A PhD scientist might well “know” the answer–and after all, it’s “obvious” that Monty’s algorithm makes no difference. In fact, I’d go farther: it’s beyond obvious. I bet very few of those providing purported corrections had even thought about it explicitly, just as few of them thought about the color of Monty’s eyes and reached an explicit conclusion that it didn’t matter.

    I tried to discuss ignorant Monty with several acquaintances back at the time of the Parade fiasco. These were people who were not mathematically ignorant, but were sure switching didn’t matter, and I tried to say, see, here’s the assumption you’re making that causes you to think that. There was a complete failure of communication in every case (I hadn’t thought of encouraging a simulation at the time). I guess I’m still trying to share the enlightenment all these years later.

    • Ted, thanks for this and for all your comments! I started this post by saying that I don’t understand what probability is – I think the debate has conclusively proved that!

      Let me try to add a very small clarification. I think you have the right approach: in thinking about conditional probability P(A|B), we need B to be more general than a fixed event, and even more general than an algorithm for choosing an event: I think a probability distribution on the space of events might be the way to go. In real life this would reflect your beliefs about how the condition B was arrived at.

      In the Monty Hall problem, as I said, I assumed that the statement “Monty knows where the car is and he is giving nothing away beyond opening one door containing a goat” was part of the specification. This is not an algorithm, but it needs to be justified by giving an algorithm. I considered “Ignorant Monty” to be a different problem. As so very often, the difficulty lies in the specification of the problem.

      Sorry this is brief; I have an office hour about to begin, and in exam time it is likely to be well patronised…

      • Ted Jones says:

        Thank you, too, Peter. You’re exactly right that Monty and ignorant Monty are different problems. My point is only that I think many of those who offered the incorrect “corrections” failed to understand that, and wrote in with the right answer to the wrong problem.

        I’m not taking the approach you suggest of conditioning on anything other than a fixed event. Rather I (we) say that the two different problems correspond to two different sample spaces, where “sample space” is understood to include the probability measure and not just the underlying set.

        For Monty vs ignorant Monty, we can assume without loss of generality that the player chooses door 1. Let C and M be the door with the car and the door Monty chooses. Consider the event E = {M=2 & C≠2}, and the conditional probability P(C=1|E).

        For the sample space let’s use the set of pairs (c,m) of possible values that (C,M) may take on. For simplicity, I’ll omit pairs with m=1, which would have probability zero with either version of Monty. Then the points in the sample space are:

        (1,2), (1,3), (2,2), (2,3), (3,2), (3,3)

        C=1 is the event {(1,2),(1,3)}, and E is the event {(1,2),(3,2)}.

        For regular Monty, assuming for simplicity that he tosses a coin when he can choose either door, the probabilities of the sample points are: 1/6, 1/6, 0, 1/3, 0, 1/3. For the different problem of ignorant Monty, we have instead: 1/6, 1/6, 1/6, 1/6, 1/6, 1/6.

        Now the conditional probabilities are cut and dried. Regular Monty:

        P(C=1|E) = P(C=1 & E)/P(E) = P{(1,2)}/P{(1,2),(3,2)} = (1/6)/(1/6+1/3) = 1/3,

        and the player should switch.

        Ignorant Monty:

        P(C=1|E) = P(C=1 & E)/P(E) = P{(1,2)}/P{(1,2),(3,2)} = (1/6)/(1/6+1/6) = 1/2,

        and it doesn’t matter.

        Is this explanation any help at all?

      • Ted Jones says:

        I see I managed to get regular Monty’s probabilities in the wrong order; of course I should have said 1/6, 1/6, 0, 1/3, 1/3, 0. I blame your other posts about the symmetric groups. Yes, that was it…
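
With the probabilities in the corrected order, the two conditional probabilities can be recomputed mechanically from the six-point sample space. This is only a sketch; the dictionary and function names are made up for the illustration.

```python
from fractions import Fraction

# Sample points are (door with the car, door Monty opens); the player holds door 1.
POINTS = [(1, 2), (1, 3), (2, 2), (2, 3), (3, 2), (3, 3)]
SIXTH, THIRD = Fraction(1, 6), Fraction(1, 3)

REGULAR = dict(zip(POINTS, [SIXTH, SIXTH, 0, THIRD, THIRD, 0]))
IGNORANT = dict(zip(POINTS, [SIXTH] * 6))


def prob_car_behind_door_1(measure):
    """P(C=1 | E), where E is the event {M=2 and C != 2}."""
    event = [(c, m) for (c, m) in POINTS if m == 2 and c != 2]
    p_event = sum(measure[point] for point in event)
    p_both = sum(measure[point] for point in event if point[0] == 1)
    return p_both / p_event


if __name__ == "__main__":
    print(prob_car_behind_door_1(REGULAR))    # 1/3, so switching wins with 2/3
    print(prob_car_behind_door_1(IGNORANT))   # 1/2
```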

  18. This week’s New Scientist has an article by Alex Bellos about the recent Gathering for Gardner. He begins the article with the boy born on Tuesday. On trying to explain it to my son over breakfast, I found that I understood it better myself (psychologically rather than mathematically).

    Suppose that someone says to you, “I have two children; the elder is a boy”. On the basis of that information, the probability that both children are boys is 1/2. Now we can add a couple of things:

    • If the statement was “I have two children; the elder is a boy born on Tuesday”, the probability is still 1/2; the extra information really is irrelevant.
    • The same would be true if one child were identified in any other way; for example, “I have two children; the taller one is a boy.”

    In particular, if you were told “I have two children; the one born on Tuesday is a boy”, then the probability of two boys is 1/2, since the statement is phrased in such a way as to identify one of the children.

    Now the statement “I have two children; one is a boy born on Tuesday” doesn’t absolutely identify one child, but is very likely to do so, since the probability that both are boys born on Tuesday is small. So we’d expect that the probability that both are boys would be closer to 1/2 than if the information about Tuesday were omitted, as indeed it is. The more unlikely it is that both children satisfy the condition, the nearer the description comes to giving a definite identification of one child, and the closer the probability comes to 1/2.

    The purpose of the apparently irrelevant information about Tuesday is to give an identification of one of the children which works with fairly high probability (26/27 in this case). If this happens, then the probability that both children are boys is 6/13; if it fails, the probability is 1. Now a simple calculation gives the answer 13/27.
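
The "simple calculation" can be spelled out by counting. The sketch below works inside the 27 cases that contain a Tuesday boy and splits them according to whether the description picks out exactly one child; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 2
TARGET = ("B", TUESDAY)

# The 27 families (out of 196) containing at least one boy born on Tuesday.
FAMILIES = [f for f in product(product("BG", range(7)), repeat=2) if TARGET in f]


def both_boys(family):
    return all(sex == "B" for sex, _ in family)


# The description identifies one child when exactly one child is a Tuesday boy.
identified = [f for f in FAMILIES if f.count(TARGET) == 1]

p_identified = Fraction(len(identified), len(FAMILIES))
p_bb_if_identified = Fraction(sum(both_boys(f) for f in identified), len(identified))

# If identification fails, both children are Tuesday boys, so the probability is 1.
p_bb = p_identified * p_bb_if_identified + (1 - p_identified) * 1

if __name__ == "__main__":
    print(p_identified, p_bb_if_identified, p_bb)   # 26/27 6/13 13/27
```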

    Incidentally, I have to take Alex gently to task. He says in the article, “If you have two children, and one is a boy, the probability of having two boys is significantly different if you supply the extra information that the boy was born on Tuesday.” But saying “the boy” implies that the child has been identified, and the paradox would evaporate!

  19. David Bedford says:

    Hi Peter,

    I’ve recently come to the same conclusion as you about why this is intuitive after all. Firstly I sidestep much of the discussion about how and why the information is revealed by phoning a complete stranger and asking two questions. The first is “Do you have two children?” Answer: yes. The second is “Is one of your children a boy with property P?” If the answer is yes then the probability that both children are boys varies between 1/3 and 1/2 depending on how likely it is that there could be two sons both with property P. If P is being the eldest then it is 1/2. If P is satisfied by all boys then it is 1/3. Born on a Tuesday, Born on Feb 28th, Born on Feb 29th (!) all bring the probability nearer 1/2 because it is more and more likely that the respondent is talking about a particular child. I’m not even sure it matters whether the answer is correct as long as the respondent believes it is correct and hence was thinking about a particular child when answering.
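
The way the answer slides from 1/3 towards 1/2 can be made explicit. The sketch below assumes that each boy has property P with probability p, independently of his sibling; that independence is an extra assumption (it covers "born on a Tuesday" but not "being the eldest"), and under it the answer works out to (2 - p)/(4 - p).

```python
from fractions import Fraction


def prob_two_boys(p):
    """P(both boys | at least one child is a boy with property P).

    Assumes each boy has property P with probability p, independently.
    """
    p = Fraction(p)
    q = p / 2                                    # a given child is a boy with property P
    p_at_least_one = 1 - (1 - q) ** 2
    p_bb_and_at_least_one = Fraction(1, 4) * (1 - (1 - p) ** 2)
    return p_bb_and_at_least_one / p_at_least_one


if __name__ == "__main__":
    for label, p in [("all boys", 1), ("Tuesday", Fraction(1, 7)),
                     ("a given date", Fraction(1, 365))]:
        print(label, prob_two_boys(p))
```

The three cases give 1/3, 13/27 and 729/1459, the last being already very close to 1/2.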

  20. seancarmody says:

    I’m not sure whether my earlier comment came through (there could have been a cookie and/or Javascript problem). In case it did not, here it is again (this time with errors corrected!).

    Peter, in one of your comments you write:

    If I knew that a different algorithm was being used, that would (by an application of Bayes’ theorem) give me a different measure, and the calculation would give a different answer. But if you want a different answer to this question, you had better make this algorithm public, so that I can use this knowledge to adjust the probabilities.

    The implication here is that some kind of minimal interpretation of the information contained in someone saying “I have two children, and (at least) one is a boy” is precisely the information in the formal mathematical event represented by the subset {BB, BG, GB} of the sample space {GG, BB, BG, GB}. I’m not sure that’s necessarily the best interpretation.

    To explain why, I’ll stick with this simpler problem rather than the Tuesday version, but the same argument applies there. I will have to assume that we are working with a bigger state space, but that doesn’t mean I have to start worrying about whether the person is lying or the probability that they would have said something in Klingon instead. I won’t say anything yet about this bigger space other than the fact that it includes events BB, BG, etc, each of which has probability 1/4.

    I’ll denote by M the “mathematically formal” event represented by {BB, BG, GB}–strictly speaking this means the union of the events BB, BG, etc in the bigger space–and H the event that a “human” tells you “I have two children, and (at least) one is a boy”. The probabilities we are interested in are P(BB | M) and P(BB | H).

    In both cases, we can appeal to Bayes’ theorem and so have

    P(BB | M) = P(BB) P(M | BB) / P(M)


    P(BB | H) = P(BB) P(H | BB) / P(H)

    Now P(M) = 3/4 and P(M | BB) = 1 and so P(BB | M) = 1/3, which is the classical answer. No surprise there. What about the case for P(BB | H)? To have it line up with the M case, we have to accept that P(H) = 3/4 is a minimal, neutral probability to assign to the event that someone tells you “I have two children, and (at least) one is a boy”. Even without worrying about complex or far-fetched algorithms, that doesn’t seem quite right.

    I would argue that a more reasonable probability would be 1/2. One algorithm that would give this outcome is that a person with BB would say “I have two children, and (at least) one is a boy”, a person with GG would say “I have two children, and (at least) one is a girl” and a person with BG or GB would say “I have two children, and (at least) one is a boy” or “I have two children, and (at least) one is a girl” at random, each with probability 1/2. In this case P(H) = 1/2, which is evident if we calculate P(H) = P(H | BB)P(BB) + P(H | GG)P(GG)+…
    Also, since P(H | BB) = 1, Bayes’ theorem gives P(BB | H) = P(BB) P(H | BB) / P(H) = (1/4 × 1) / (1/2) = 1/2.
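The two Bayes calculations can be laid side by side in a few lines. This is a minimal sketch, assuming the particular volunteering rule described in this comment; the names `prior`, `p_m_given`, `p_h_given` and `posterior_bb` are introduced here for illustration only.

```python
from fractions import Fraction

families = ["BB", "BG", "GB", "GG"]
prior = {f: Fraction(1, 4) for f in families}

# P(M | family): the formal event "at least one boy".
p_m_given = {f: Fraction(1 if "B" in f else 0) for f in families}

# P(H | family): chance the parent volunteers "at least one is a boy"
# under the assumed rule (mixed families pick a statement at random).
p_h_given = {
    "BB": Fraction(1),
    "BG": Fraction(1, 2),
    "GB": Fraction(1, 2),
    "GG": Fraction(0),
}

def posterior_bb(likelihood):
    """Return (P(BB | event), P(event)) by Bayes' theorem."""
    evidence = sum(prior[f] * likelihood[f] for f in families)
    return prior["BB"] * likelihood["BB"] / evidence, evidence

print(posterior_bb(p_m_given))  # (Fraction(1, 3), Fraction(3, 4))
print(posterior_bb(p_h_given))  # (Fraction(1, 2), Fraction(1, 2))
```

Only the likelihood table changes between the two cases; the prior on the four family types is the same.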

    An important point to emphasise here is that I don’t really have to assume I know precisely what algorithm is being used. Rather, I need to be able to come up with the probabilities P(H) and P(H | BB). To me, values of 1/2 and 1 seem like very reasonable minimal interpretations of this information.

    What does all this mean for the Tuesday problem? Here we can again distinguish the mathematical event M and the human utterance event H for the statement “I have two children, and at least one of them is a boy born on Tuesday”. P(BB | M) = 13/27 for the reasons discussed in the post, but in the case of the human utterance, I think you can quite reasonably assign P(H | BB) = 1/7 (all days are equally likely to have been spoken) and P(H) = 1/14 (the day and the gender are independent), which means that P(BB | H) = 1/2!
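The Tuesday figures can be checked the same way by enumerating all 196 equally likely (sex, day) pairs for two children. The sketch assumes one specific way of producing the utterance, which is not spelled out in the comment: the parent picks one of the two children at random and reports that child’s sex and day of birth. The names `p_m`, `p_h` and `posterior_two_boys` are hypothetical.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1                      # arbitrary label for one of the 7 days
TARGET = ("B", TUESDAY)          # a boy born on Tuesday

children = list(product("BG", range(7)))       # 14 child types
families = list(product(children, repeat=2))   # 196 ordered pairs
weight = Fraction(1, len(families))

def is_bb(family):
    return family[0][0] == "B" and family[1][0] == "B"

def p_m(family):
    """Formal event M: at least one child is a boy born on Tuesday."""
    return Fraction(1 if TARGET in family else 0)

def p_h(family):
    """Assumed utterance H: a randomly chosen child is described."""
    return Fraction(family.count(TARGET), 2)

def posterior_two_boys(likelihood):
    evidence = sum(weight * likelihood(f) for f in families)
    joint = sum(weight * likelihood(f) for f in families if is_bb(f))
    return joint / evidence

print(posterior_two_boys(p_m))  # 13/27
print(posterior_two_boys(p_h))  # 1/2
```

With this rule P(H) comes out as 1/14 and P(H | BB) as 1/7, matching the values assigned above.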

  21. Pingback: Probability Paradoxes | Stubborn Mule

  22. seancarmody says:


    I have expanded on my argument as to why the probability should be 1/2 if the information is volunteered by the father and 13/27 if you ask the question ‘do you have at least one boy born on Tuesday’ and receive a ‘yes’. I think the two scenarios reveal different information. Any thoughts you (or others) have would be greatly appreciated!

  23. Sam says:

    The answer is 1/3.

    The trick is based on the indeterminate nature within the BB case – which boy did he refer to? You must not double count the distinct cases.

    There are only 7 distinct BB cases, not 13 (or 14). There are 7 BG cases and 7 GB cases. So the correct denominator is 21.

    7/21 = 1/3.

    Common sense should tell you that the Tuesday fact has no bearing on the conditional probability in question.

    To look at it another way, each of the following cases is possible for the second child:

    boy, girl, girl

    And each of these cases is equally likely at 1/3.

    In the BG and GB case, you know for a fact that the B is the one mentioned, the one who happens to be born on Tuesday. Well it’s exactly the same in the BB case – one of the B’s is the one mentioned, the other is a possible boy you know nothing about, except that the probability of him existing is 1/3.

  24. Sean Carmody says:

    Sam: you say

    Common sense should tell you that the Tuesday fact has no bearing on the conditional probability in question.

    If common sense was all that was required, I am sure that this puzzle would not be so controversial!

    You are simply asserting that the possibilities for the ‘other’ child are boy, girl, girl, each with probability 1/3 and therefore the probability of two boys is 1/3. Other than an appeal to common sense, how would you differentiate this argument from someone who says that there are only two possible genders for the ‘other’ child, boy or girl, each with probability 1/2 and therefore the probability of two boys is 1/2? While I don’t agree with that particular argument, it is no worse than yours.

  25. ales says:

    Hello, and what about this example? Does anyone know the solution?

    Assume you have an algorithm which errs with a probability of at
    most 1/4 and that you run the algorithm k times and output the majority output.
    Derive a bound on the error probability as a function of k. Do a precise calculation
    for k = 2 and k = 3, and give a bound for large k. Finally, determine k such that the
    error probability is less than a given “epsilon”.

    Thanks, Ales
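One possible reading of this exercise can be sketched numerically. The assumptions here go beyond what the question states: the k runs are independent, each run is wrong with probability exactly p = 1/4 (the worst case allowed), and the majority output “fails” whenever the correct answer does not get a strict majority, so ties count as failures. Hoeffding’s inequality is used as one standard choice for the large-k bound; the question does not say which bound is intended. All function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb, exp, log

def majority_failure(k, p=Fraction(1, 4)):
    """Exact P(at least half of k independent runs are wrong)."""
    p = Fraction(p)
    return sum(
        comb(k, j) * p**j * (1 - p) ** (k - j)
        for j in range(k + 1)
        if 2 * j >= k
    )

def hoeffding_bound(k, p=0.25):
    """Upper bound exp(-2k(1/2 - p)^2) on the failure probability."""
    return exp(-2 * k * (0.5 - p) ** 2)

def runs_needed(epsilon, p=0.25):
    """Smallest k for which the Hoeffding bound is at most epsilon."""
    return ceil(log(1 / epsilon) / (2 * (0.5 - p) ** 2))

print(majority_failure(2))  # 7/16 (ties counted as failures)
print(majority_failure(3))  # 5/32
print(runs_needed(0.01))    # 37
```

For p = 1/4 the bound is exp(−k/8), so under these assumptions k of about 8 ln(1/epsilon) runs suffice; the exact binomial sum shows that the true failure probability is considerably smaller than the bound.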

  26. Pingback: 2010 in review « Peter Cameron's Blog

  27. margazhi mama says:

    “boy born on Tuesday problem”: see Kai Lai Chung Elementary Probability Theory with Stochastic Processes Springer International Student Edn. Ch. 5.1 Example 5 pp. 115-6 (I have the Indian reprint and my laptop can’t handle the Greek and other symbols, so I give only as much of the text as I can).

    Consider all families with 2 children and assume that boys and girls are equally likely. Thus the sample space may be denoted schematically by 4 points:
    {(bb), (bg), (gb), (gg)}
    The order in each pair is the order of birth, and the 4 points have probability ¼ each. If a family is chosen at random, and found to have a boy in it, what is the probability that it is of the type (bb)?
    Let us put
    A = {w / there is a boy in w}
    B = {w / there are 2 boys in w}
    Then B is contained in A so AB = B, thus
    P (B/A) = P (B) divided by P (A) = ¼ divided by ¾ = 1/3

    But now ask a similar sounding but really different question. If a child is chosen from these families and is found to be a boy, what is the probability that the other child in his family is also a boy? This time the appropriate representation of the sample space would be
    {gg, gb, bg, bb} (second letter of each is a subscript)
    where the sample points are not families, but the children of these families, and gg = a girl who has a sister, etc. Now we have:
    C = {w / w is a boy}
    D = {w / w has a brother}
    so that
    CD = { w / w = bb}
    P (D/C) = P (CD) divided by P (C) = ¼ divided by ½ = 1/2
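The difference between the two questions is just a difference in what gets sampled, which a short enumeration makes concrete. This is a sketch under the same assumptions as the quoted example (four equally likely family types); the variable names are introduced here for illustration.

```python
from fractions import Fraction

families = ["bb", "bg", "gb", "gg"]  # order of letters = order of birth

# Question 1: sample a family, condition on it containing a boy.
with_boy = [f for f in families if "b" in f]
p_family = Fraction(sum(f == "bb" for f in with_boy), len(with_boy))

# Question 2: sample a child, i.e. a (family, position) pair,
# and condition on that child being a boy.
children = [(f, i) for f in families for i in range(2)]
boys = [(f, i) for f, i in children if f[i] == "b"]
p_child = Fraction(sum(f[1 - i] == "b" for f, i in boys), len(boys))

print(p_family)  # 1/3
print(p_child)   # 1/2
```

In the second sample space the family bb contributes two boys while bg and gb contribute one each, which is why the answer moves from 1/3 to 1/2.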
