Browsing through an old issue of Significance, the Royal Statistical Society glossy, I came upon an interesting article “Dicing with the unknown” by Tony O’Hagan, professor of statistics at Sheffield. It brought home to me something I hadn’t realised clearly before.
At risk of doing violence to the argument, I will start at the end. When the results of a scientific experiment are analysed by a frequentist statistician, they are usually given in terms of p-values or confidence intervals. O’Hagan points out that these are often not correctly understood by the scientists:
When the null hypothesis is rejected with a p-value of 0.05, this is widely misunderstood as saying that there is only a 0.05 chance that the null hypothesis is true. If told that (3.2,5.7) is a 95% confidence interval for a certain parameter, the interpretation that there is a 95% probability that the parameter lies between 3.2 and 5.7 is extremely common.
In fact, of course, as generations of statistics lecturers have tried to explain, in the first case, there is only a 5% chance of obtaining a result at least as extreme if the null hypothesis is true; that is, in a long sequence of identical experiments, if the null hypothesis were true, such an extreme result would be obtained in at most 5% of the experiments. Similarly, in the second, the parameter is fixed and it is the interval that varies: if the same experiment were repeated many times, the interval constructed by this procedure would fail to contain the true value of the parameter in only 5% of cases.
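The frequentist reading can be made concrete with a small simulation (my own illustration, not from the article): fix a true mean, repeat the experiment many times, and count how often the 95% interval constructed from each sample covers the truth.

```python
# Illustrative sketch: frequentist coverage of a 95% confidence interval.
# The true mean is fixed; it is the interval that varies from experiment
# to experiment. (All numbers here are arbitrary choices for the demo.)
import random
import math

random.seed(0)
TRUE_MEAN = 4.0      # the fixed parameter, unknown to the scientist
SIGMA = 1.0          # known standard deviation, for simplicity
N = 25               # sample size per experiment
TRIALS = 10_000      # number of repeated experiments
Z95 = 1.96           # normal quantile for a 95% interval

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    sample_mean = sum(sample) / N
    half_width = Z95 * SIGMA / math.sqrt(N)
    if sample_mean - half_width <= TRUE_MEAN <= sample_mean + half_width:
        covered += 1

print(f"coverage: {covered / TRIALS:.3f}")  # close to 0.95
```

The point of the exercise is that the 95% attaches to the long-run behaviour of the interval-constructing procedure, not to any one interval such as (3.2, 5.7).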
O’Hagan contends that the scientist really wants to know if the new theory is true (that is, the null hypothesis is false), or the exact value of the parameter, and is looking for a probabilistic statement about this, which the frequentist statistician cannot provide. The Bayesian, on the other hand, has no difficulty. Once you have swallowed the whale of putting a prior probability on the truth of the theory, or the value of the parameter, the experiment can update that to a posterior probability reflecting the new knowledge gained from the experiment.
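The Bayesian update can be sketched with the standard Beta-Binomial conjugate pair (my choice of example; the article does not specify one): a prior Beta(a, b) on a coin’s heads-probability, updated by observed counts to a posterior Beta(a + heads, b + tails), does support direct probability statements about the parameter itself.

```python
# Sketch of a Bayesian prior-to-posterior update (Beta-Binomial conjugacy).
# Unlike the confidence interval, the posterior licenses statements such as
# "the probability that the parameter exceeds 0.5 is about 0.89".
import random

a, b = 1, 1            # uniform prior: no initial opinion about the bias
heads, tails = 7, 3    # hypothetical experimental data
a_post, b_post = a + heads, b + tails   # posterior is Beta(8, 4)

post_mean = a_post / (a_post + b_post)  # posterior mean, here 2/3

# Estimate P(theta > 0.5 | data) by drawing from the posterior:
random.seed(0)
draws = [random.betavariate(a_post, b_post) for _ in range(100_000)]
prob = sum(d > 0.5 for d in draws) / len(draws)
print(f"posterior mean: {post_mean:.3f}, P(theta > 0.5 | data): {prob:.2f}")
```

The whale to be swallowed is visible in the first line: the prior Beta(1, 1) is a substantive commitment made before any data are seen.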
As I said, this is almost the conclusion of the article. (The real conclusion is that “Every statistician needs to understand the difference between the frequentist and the Bayesian theories of statistics, and every practising statistician must (at least implicitly) choose between them.”) The real point of the article is to track this difference to its origin. Its successive paragraphs discuss
- two kinds of uncertainty: epistemic (due to lack of knowledge) and aleatory (due to randomness);
- two kinds of probability, or rather two interpretations of what probability means: frequentist (limiting frequency in many identical independent trials) and Bayesian (degree of belief);
- two kinds of statistics (as described above, really two ways of interpreting the result of an experiment);
- and finally, two kinds of statistician, who should at least understand one another’s viewpoint.
I suppose a classical physicist would go back further and say that all uncertainty is really epistemic (the coin toss is random only because I don’t have sufficiently accurate information about velocities, forces, etc.), while a quantum physicist would point to a source of real aleatory uncertainty at the microscopic level. It is hard to see how these arguments would really impinge on the debate between scientist and statistician, though.