Mathematicians and statisticians

The Indian mathematician and statistician Raj Chandra Bose had an impressive lineage, having been taught by F. W. Levi in geometry and R. A. Fisher in statistics. He generated various stories in his lifetime, some of which were gathered in a volume of Sankhya after his death. Here is one.

There is a very famous joke about Bose’s work in Giridh. Professor Mahalanobis wanted Bose to visit the paddy fields and advise him on sampling problems for the estimation of yield of paddy. Bose did not very much like the idea, and he used to spend most of the time at home working on combinatorial problems using Galois fields. The workers of the ISI used to make a joke about this. Whenever Professor Mahalanobis asked about Bose, his secretary would say that Bose is working in fields, which kept the Professor happy.

A statistician and a mathematician discuss fields

This delightful and possibly apocryphal story (illustrated here by artist Neill Cameron) is symptomatic of a more serious issue: the misunderstandings that arise between mathematicians and statisticians. I have recently been trying to explain to mathematicians the statisticians’ view of block designs, and to commend it to them in view of the many fascinating mathematical problems it throws up. See my paper with R. A. Bailey in Surveys in Combinatorics 2009, volume 365 in the London Mathematical Society Lecture Note Series.

In brief, many mathematicians use the term “block design” to mean a collection of k-element subsets of a set of size v, with the property that any two points are contained in exactly λ of these subsets or “blocks”. Most mathematicians know that the concept was invented by statisticians, and probably think that statisticians use five parameters to describe such a design, which must be written in a ritually correct order which is impossible to remember without simply committing it to memory.
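The mathematician’s definition is easy to check by machine. Here is a minimal Python sketch; the function names are my own inventions for illustration, and the Fano plane is simply a convenient small example. It also reports the five parameters in what I take to be the conventional order (v, b, r, k, λ), where b is the number of blocks and r the number of blocks containing each point.

```python
from collections import Counter
from itertools import combinations


def pair_counts(blocks):
    """Count how many blocks contain each unordered pair of points."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


def balance_parameter(points, blocks):
    """Return lambda if every pair of distinct points lies in the same
    number of blocks, and None otherwise."""
    counts = pair_counts(blocks)
    values = {counts[pair] for pair in combinations(sorted(points), 2)}
    return values.pop() if len(values) == 1 else None


def five_parameters(points, blocks):
    """Return (v, b, r, k, lambda), or None if the replication, the block
    size or the pair count is not constant."""
    points = sorted(points)
    sizes = {len(block) for block in blocks}
    replications = {sum(p in block for block in blocks) for p in points}
    lam = balance_parameter(points, blocks)
    if len(sizes) != 1 or len(replications) != 1 or lam is None:
        return None
    return (len(points), len(blocks), replications.pop(), sizes.pop(), lam)


# Illustrative example (an assumption of this sketch): the Fano plane on 0..6.
FANO = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]

print(five_parameters(range(7), FANO))  # (7, 7, 3, 3, 1)
```

The output satisfies the two standard counting identities bk = vr and r(k − 1) = λ(v − 1), which is why only three of the five parameters are really independent.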

A statistician would call such a design a “balanced incomplete-block design”, and would say that there are many block designs which are not balanced (that is, do not satisfy the condition involving λ), and indeed the circumstances of a particular experiment (the number of varieties to be tested, the number of experimental units available) usually preclude the existence of such a design. It is necessary to use the best design possible (so that estimates of treatment differences can be made as accurately as possible), and the theory of optimal designs has been developed for this purpose. (Balanced designs are known to be optimal on all criteria if they exist – this is Kiefer’s Theorem.)

But more seriously, a statistician does not primarily think of a block design as a collection of subsets of the treatment set, but rather as two partitions of the set of experimental units: one partition, into blocks, is imposed by circumstances over which the experimenter has no control, elements of a block being more alike than those in different blocks; the other partition, into treatments, is almost entirely under the experimenter’s control, and essentially constitutes the “design”.

Now the mathematician’s design is obtained by identifying each block (or part of the first partition) with the set of treatments applied to units in that block. But a moment’s thought shows that the statistician’s view is much more general. It allows the possibility that the same treatment is applied to more than one unit in a block (so that the blocks are multisets rather than sets), and has no problem at all with two blocks being allocated the same treatments (what a mathematician refers to as “repeated blocks”, and tends to disallow).
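To make the two-partition view concrete, here is a small Python sketch. It assumes (my choice of representation, not anything canonical) that each experimental unit is recorded as a pair (block label, treatment); the function names and the toy data are hypothetical. Collapsing the units of each block gives a multiset of treatments, and the two features a mathematician tends to disallow show up as simple tests.

```python
from collections import Counter, defaultdict


def blocks_from_units(units):
    """Map each block label to the multiset (Counter) of treatments
    applied to the units in that block."""
    blocks = defaultdict(Counter)
    for block_label, treatment in units:
        blocks[block_label][treatment] += 1
    return dict(blocks)


def is_binary(blocks):
    """True if no treatment occurs more than once in any block, so that
    the blocks are genuine sets rather than multisets."""
    return all(count == 1
               for multiset in blocks.values()
               for count in multiset.values())


def has_repeated_blocks(blocks):
    """True if two different blocks receive the same multiset of treatments."""
    seen = Counter(frozenset(multiset.items()) for multiset in blocks.values())
    return any(n > 1 for n in seen.values())


# Hypothetical toy experiment: nine units in three blocks, treatments A, B, C.
units = [
    ("b1", "A"), ("b1", "A"), ("b1", "B"),   # treatment A used twice in b1
    ("b2", "A"), ("b2", "B"), ("b2", "C"),
    ("b3", "C"), ("b3", "A"), ("b3", "B"),   # same treatments as b2
]

blocks = blocks_from_units(units)
print(is_binary(blocks))            # False
print(has_repeated_blocks(blocks))  # True
```

Both features are perfectly legitimate in the statistician’s picture; the mathematician’s design is the special case where the first test returns True and, usually, the second returns False.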

This account so far seems to suggest that it is the mathematicians who are inflexible. But this is by no means the whole story. A few years ago the Royal Statistical Society had a meeting about the appropriate use of mathematics within statistics. Two of the four contributions were simply anti-mathematical rants (one, if I recall correctly, talked about the “monstrous regiment of mathematicians”). The other two offered a more thoughtful assessment: complicated abstract mathematics should not be used without good reason, but mathematical treatment has the properties of generalisation, abstraction, and (odd though it may seem) simplicity. Some algebraic calculations are really easier than a typical one of the infinitely many numerical calculations they replace!

About Peter Cameron

I count all the things that need to be counted.

One Response to Mathematicians and statisticians

  1. Last week I was on holiday in the vicinity of a statistics conference. I went to a few of the talks.

    One statistician said something like this:

    I proved the formula. But I didn’t trust my proof, so I did an extensive simulation, which agreed well with the formula.

    I do not think a mathematician could have said that!

    (This is not to say that mathematicians don’t check their formulae by working examples; rather that, for us, a simulation carries far less conviction than a proof.)
