## Note on probability

A bag contains two red, three green and five blue balls. You draw two balls in order without replacement.

• What is the probability that the first ball is red?
• What is the probability that the second ball is red?
• What is the expected number of red balls in your sample?

This is an easy question, but exhibits two important things about probability which I still find slightly miraculous.

The answer to the first question is obviously 2/10 = 1/5.

The second is a bit more complicated. If the first ball was red, then only one of the nine balls remaining is red; if the first ball was not red, then two of the nine remaining are red. So by the Theorem of Total Probability, the answer is

(1/5)(1/9)+(4/5)(2/9) = (1+8)/45 = 1/5,

the same as the answer to the first question. Could this have been expected? Of course, “by symmetry” (where the symmetry here consists of letting time run backwards and choosing the second ball first).
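The agreement can be checked by brute force. Here is a minimal sketch that lists every ordered pair of distinct balls and counts exactly; the names `bag`, `draws` and `prob` are my own illustrative choices.

```python
from fractions import Fraction
from itertools import permutations

# The bag from the text: two red, three green, five blue balls.
bag = ["R"] * 2 + ["G"] * 3 + ["B"] * 5

# Every ordered draw of two distinct balls (identified by position in the bag).
# Each of these 10 * 9 = 90 draws is equally likely.
draws = list(permutations(range(len(bag)), 2))

def prob(event):
    """Exact probability of an event over the equally likely ordered draws."""
    favourable = sum(1 for d in draws if event(d))
    return Fraction(favourable, len(draws))

p_first_red = prob(lambda d: bag[d[0]] == "R")
p_second_red = prob(lambda d: bag[d[1]] == "R")

print(p_first_red, p_second_red)  # 1/5 1/5
```

Swapping the two positions in each pair maps the list of draws onto itself, which is one way of seeing why the two counts have to agree.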

I have to admit I don’t have a simple convincing argument for the validity of this symmetry principle, though there is no doubt that it is correct.

For the third, we use the almost miraculous property of expectation, namely that it is additive. Questions of independence can be simply ignored. The expected number of red balls chosen on each draw is 1/5, so the expected number in the sample is simply (1/5)+(1/5) = 2/5.
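As a sanity check, the same enumeration gives the expectation directly, and also shows that the two draws really are dependent. This is only a sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

bag = ["R"] * 2 + ["G"] * 3 + ["B"] * 5
draws = list(permutations(range(len(bag)), 2))

# Expected number of red balls, averaged over all equally likely ordered draws.
total_red = sum((bag[i] == "R") + (bag[j] == "R") for i, j in draws)
expected_red = Fraction(total_red, len(draws))

# Probability that both balls are red, for comparison with (1/5) * (1/5).
both_red = sum(1 for i, j in draws if bag[i] == "R" and bag[j] == "R")
p_both_red = Fraction(both_red, len(draws))

print(expected_red)  # 2/5
print(p_both_red)    # 1/45
```

The probability of two reds is 1/45, not the 1/25 that independence would give, yet the expectation is still the sum of the two separate expectations.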

I first thought this through when I had to teach first-year probability some time ago. I was reminded of it recently by a news item in Significance on research by the American Institute of Physics, putting into context the observation that one-third of US physics departments are all-male.

American physics departments are typically small. The sizes of BSc-awarding departments range from 1 to 27 professors with a median of 4; for PhD-awarding departments, the range is 3 to 75 with median 22. Of all physics professors, 13% are female.

Taking the sizes of the physics departments as given, and modelling the gender balance of each as a binomial distribution, they found the expected proportion of all-male departments to be even higher than 1/3, suggesting that some affirmative action is already taking place.

At first, I wondered if the approach was too simplistic; shouldn’t they have treated the departments all together rather than individually, or used a hypergeometric rather than a binomial distribution? Well, the symmetry principle and the additivity of expectation show that in fact the approach is valid for finding the expected number of all-male departments, assuming only that each department is such a small fraction of the total that the binomial approximation to the hypergeometric is valid.
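To make this concrete, here is a rough sketch comparing the two models. The department sizes and the pool of 10,000 professors are invented for illustration and are not the AIP data; only the 13% figure comes from the text. By symmetry each department is a random sample from the pool, and by additivity the expected number of all-male departments is the sum of the per-department probabilities.

```python
from math import comb

# Assumption: illustrative department sizes, NOT the AIP data.
sizes = [1, 2, 4, 4, 6, 10, 22, 27]
p_female = 0.13  # proportion of female physics professors quoted in the text

# Binomial model: each professor is independently female with probability 0.13.
binom_all_male = [(1 - p_female) ** n for n in sizes]

# Hypergeometric model: each department is a sample without replacement
# from a fixed pool. Assumption: a hypothetical pool of 10,000 professors.
pool = 10_000
males = pool - round(p_female * pool)
hyper_all_male = [comb(males, n) / comb(pool, n) for n in sizes]

# Additivity of expectation: sum the per-department probabilities,
# ignoring any dependence between departments.
expected_binom = sum(binom_all_male)
expected_hyper = sum(hyper_all_male)

print(expected_binom, expected_hyper)
```

With a pool this large compared with any one department, the two expected counts come out very close, which is the sense in which the binomial approximation is harmless here.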

Of course, something more complicated would be required to find a confidence interval for the expected number, as would be necessary to investigate whether there really was some hiring bias going on.

## About Peter Cameron

I count all the things that need to be counted.

### 3 Responses to Note on probability

1. James says:

Even simpler than symmetry between forward time and reverse time is symmetry between the balls. Nothing in the procedure breaks the symmetry between the different balls, so each of the ten balls is equally likely to be the second one chosen – how could any one be more likely than any other? Seen this way, I think the second question becomes just as “obvious” as the first.

• Obvious to you and me, but I have found it is not so obvious to first-year undergraduates.

2. James says:

“Obvious” was your word. I didn’t say the second one was obvious – I said that, seen in this way, it becomes as obvious as the first one. Is the first one obvious? Who can say. Is it obvious to “see it in this way”? Probably not.

Anyway, do you agree that the symmetry between balls, rather than the symmetry between first choice and second choice, is the clearest one here? That was the point I was actually trying to make.