As promised, here is some discussion of Donald Preece’s puzzle.
I first met Donald at the British Combinatorial Conference in Aberystwyth in 1973; we sat together on the coach on the excursion. At the time, he worked in horticultural research. To simplify considerably, suppose you are designing an experiment on apple trees. You have to take account of the spatial layout of the trees, but also potentially the fact that the trees might have been used for an experiment the previous year, which might have some carry-over effect. (Think about Diamond Geezer’s puzzle; if the trees are in a 4×4 square, and last year’s four treatments are represented by the suits, then this year’s could be the values.)
Donald had lectured about things related to the puzzle I described in the previous post. I had been thinking about similar matters, in connection with doubly transitive permutation groups, and I had a family of designs which could be thought of as corresponding to a pack of 96 cards, with six suits and sixteen values; but, for what I was doing, it was not necessary for me to arrange them in a rectangle. I gave Donald the designs, he arranged them in a rectangle, and the result was our first joint paper.
However, I didn’t really understand at the time what he was doing, and it took me a quarter of a century to figure it out. (Donald has an amazing talent for coming up with designs of this kind, but is much less good at explaining exactly what properties they have.) Eventually, I produced an infinite family, generalising the construction in our joint paper, and published them in the proceedings of the 2001 BCC, which was organised in Canterbury by Donald himself; my paper was a sort of tribute to our long association.
Here I would like to explain exactly what the conditions are, and briefly decribe the examples and some open problems.
We will start with a set of kn elements, corresponding to the playing cards. A statistician would call them “experimental units” or “plots”; in the theory of buildings, in the terminology developed by Jacques Tits, they would be “chambers”. I will call them “cells”, since there will be various ways to think of the structure as an k×n rectangle. I assume that k < n. In the playing card example, k = 4 and n = 13.
On the set of cells, we have a collection of partitions, of two types:
- Type A partitions, with k parts each of size n;
- Type B partitions, with n parts each of size k.
In the example, “suits” and “rows” are Type A partitions; “values” and “columns” are Type B partitions.
The are three types of conditions that our partitions are required to satisfy. These requirements do make sense, in both the statistical and group-theoretic contexts.
Orthogonality conditions
We require that a partition of type A and a partition of type B are orthogonal, in the sense that each part of the first partition meets each part of the second in a single cell. This means that the two partitions impose an k×n rectangular structure on the set of cells. In our example, it is clear that suits and values are orthogonal, and that rows and columns are orthogonal; the orthogonality of suits anc columns is the first condition of the puzzle specification, and the orthogonality of rows and values is the second.
Balance conditions
These conditions need a bit of preparation. Given two partitions of the same type, there are two possible relationships between their parts, determined by their intersections. In our example, a part of the “suits” partition and a part of the “rows” partition meet in either three or four cells, by condition 4; and a part of the “columns” partition and one of the “values” partition meet in 0 or 1 cells, by condition 3. Pick one of the two values, and say that parts of the two partitions are “incident” if they intersect in that number of cells. (I will use the greater number here for convenience.)
In general we require that there are two possible intersection sizes, differing by one, for parts of the two partitions.
Now we require that the two partitions should form what combinatorial design theorists call a square 2-design, and statisticians a symmetric balanced incomplete-block design, or SBIBD: that is, any two parts of one partition are incident with a constant number λ of parts of the other. In the example, for suits and rows, λ = 0, since no two suits both have four occurrences in a given row. For columns and values, λ = 1, by condition 5.
The existence of a SBIBD puts restrictions on k and n. For example, a counting argument shows that
k(k−1) = (n−1)λ,
so that n−1 divides k(k−1). There are further restrictions too, such as the Bruck–Ryser–Chowla theorem. For most “admissible” pairs (n,k), we have no idea whether a design exists or not.
In this example, the column–value design has n = 13, k = 4, and λ = 1. There is a unique such design, usually referred to as the projective plane of order 3. We can identify the columns with the 1-dimensional subspaces and the values with the 2-dimensional subspaces of a 3-dimensional vector space over the field of integers mod 3.
One thing further remains to be said. Combinatorial design theorists will not like the design formed by suits and rows, since any row has three or four incidences with each suit. To make this structure resemble a traditional design, we simply remove three incidences from each pair, leaving zero or one; this was effectively what our construction achieves.
Youdenization
Consider an incidence structure (whose elements are called “points” and “lines”) with the following properties:
- there are n points and n lines;
- each point is incident with k lines, and each line with k points.
We can regard this structure as a set of size kn with two type B partitions, as follows: the cells are the “incidences”, the ordered pairs consisting of a point and a line which are incident; the first partition corresponds to the relation “same point”, and the second to the relation “same line”.
Our balance condition is satisfied if and only if this is a SBIBD or square 2-design.
Can we arrange this design in a rectangle? For this, we have to find a type A partition which is orthogonal to the point and line partitions. This can be done by a process called Youdenization, which depends on Hall’s Marriage Theorem. According to this theorem, we can find a system of distinct representatives for the lines: that is, we can find a point in each line so that the chosen points are distinct. The resulting set of n incidences forms the first part of the new partition. (Said otherwise, the chosen cells form the first row of our array.) Now we remove these incidences and continue until all the cells
have been used.
So in particular any SBIBD can be represented as a set of cells with one type A and two type B partitions, satisfying our orthogonality and balance conditions, and so can be written out in a rectangle where the columns are
the blocks and each row contains each point once.
One particularly simple case of Youdenization occurs when the design has cyclic symmetry. Write any block in the first column and obtain the other columns by shifting:
1 | 2 | 3 | 4 | 5 | 6 | 7 |
2 | 3 | 4 | 5 | 6 | 7 | 1 |
4 | 5 | 6 | 7 | 1 | 2 | 3 |
This works more generally if the design admits a regular group action, that is, if it arises from a difference set.
Youdenizing a set of more than two type B partitions is much more problematic, and indeed is not always possible, as we shall see.
Linking conditions
The last type of condition only comes into play when there are more than two partitions of the same type.
We consider the type B partitions, and look first at the abstract structure of the linkage in terms of incidence. Suppose we have three sets X,Y,Z, with an incidence relation between each pair. We assume that each pair of sets, with its incidence relation, forms a SBIBD or square 2-design with parameters (n,k,λ). We say that the three designs are linked if the following condition holds:
There are two constants a,b such that, given points in two of the sets, say x∈X and y∈Y, the number of points z∈Z incident with both x and y is
- a, if x and y are incident with one another;
- b, otherwise.
(Actually, this is the natural linking condition in the group-theoretic situation. In the statistical situation, the condition is weaker, but harder to explain, so I will stick with this one.)
Linking imposes further restrictions on the parameters of the designs in question. In particular, k−λ must be a perfect square. In fact, the only parameters for which linked designs are known to exist are of the form n = 4u2, k = 2u2±u, λ = 2u2±u, where u is a power of 2. The construction is very geometric, involving quadratic forms and Kerdock codes, and may be the subject of another post someday.
If there are more than three sets, we require an incidence relation between each pair of sets so that they form a design, and the linking condition should hold between each triple of sets. No higher linking is required. In our context of type B partitions, we require that the incidence relations naturally defined between different partitions should satisfy these conditions.
Technically, the linking condition for type A designs (if there are more than two of them) are the same. But we will only ever need this in two special cases: we have k divides n−1 in the first case and k divides n+1 in the second.
The first is the case where each set in one partition is incident with a single set of another partition, so that we have a “trivial” (k,1,0) design. In this case, incidence is a bijection between each pair of partitions. We can number the sets of the first partition 1,…,k, and use incidence to transfer the numbering to the other sets, so that between the first and any other partition two sets are incident if and only if they have the same number. (In our example, this means that, say, hearts are suit number 1 if there are four hearts in the first row, and so on.) Now the linking condition for three sets says precisely that two sets from any two partitions are incident if and only if they have the same number.
In the second case, each set in one partition is incident with all but one set in any other, and everything is as before, with the bijections defined by non-incidence rather than incidence.
Example: one type A, many type B
This is the case that Donald and I originally considered. I had the linked designs, Donald did the Youdenization. This means that we have to find a collection of sets, each a transversal for the sets X,Y,Z,… carrying the linked designs, and each consisting of mutually incident points, so that each incident pair of points from any two of the sets occurs in exactly one of these transversals. Moreover, the transversals must be partitioned into “parallel classes”, each of which has the property that it covers all the points of the sets precisely once. Then we take these transversals as the cells; the partition into parallel classes is the type A partition; and the partitions according to the points of X,Y,Z,… are the type B partitions.
Not all families of linked designs will have such transversals, and having a collection of them with all the above properties is even rarer.
This is the construction that took me 25 years to understand. I still wonder if I really do! Anyway, I was able to construct an infinite family of examples, using geometry and group theory.
Example: two or more of each type
The card trick is of this kind, with two partitions of each type. Donald, and several co-authors, separately or together, have produced many examples. Most of them have n = 2k+1, for prime values of k; but there are some with k = p+1 and n = p2+p+1, where p is prime (the card example has p = 3), and others with n = 37, k = 9. The other case is represented by examples with n = 11, k = 6.
The job here is to start with the SBIBD and to produce two different Youdenizations of it, balanced with respect to each other.
Currently Donald and JP Morgan are constructing examples with two type B partitions and more than two type A partitions; that is, several different Youdenizations, any two balanced and any three linked.
A problem
Parameters of known examples with two type B and at least two type A factors all have k dividing n±1. But parameters of known examples with more than two type B factors are completely different, and don’t satisfy this condition; indeed, the known parameters would not permit balance of the type A factors.
Is it true that there is no such arrangement with at least three factors of each type?