EKR, Steiner systems, association schemes, and all that

A great number of mathematical problems amount to looking in a large but highly structured graph, and finding a complete or null subgraph of largest possible size there.

For a simple example, consider Latin squares of order n. One of these is an n×n array containing the entries 1,2,…,n, arranged in such a way that the entries in any row or any column are all distinct. So each row is a permutation of the numbers 1,2,…,n (in the nineteenth-century sense, that is, an arrangement of these numbers rather than the operation of rearranging them), and any two of the n permutations defined by all of the rows disagree in all positions. Thus, if we make a graph whose vertices are the permutations, two vertices joined if they agree in at least one position, then a Latin square is the vertex set of a null subgraph of maximum size.

I propose to speak about a class of graphs which includes some which give a context, in this way, to two big theorems of combinatorial mathematics: the construction of Steiner systems, and the Erdős–Ko–Rado theorem. Considering the whole class of graphs leads us to a conjecture which would extend (in some sense) both of these big results.

I’ll conclude with an even more general context for this, the theory of association schemes.

Steiner systems

The Lady’s and Gentleman’s Diary in 1844 published a problem which proved too difficult for its readers, and indeed was not solved for more than a century and a half. The problem was posed by the Editor, Wesley Woolhouse, and ran as follows:

Determine the number of combinations that can be made out of n symbols, p symbols in each; with this limitation, that no combination of q symbols, which may appear in any one of them, shall be repeated in any other.

We assume that q < p < n to make things interesting. But for my purposes later on, I will change notation, and use k and t instead of p and q.

There is an upper bound, C(n,t)/C(k,t), where C denotes the binomial coefficient. For the numerator is the number of subsets of size t; in a collection of combinations satisfying the problem, each combination contains C(k,t) of them, and there is no overlap. But, without giving a construction, it is not clear whether this upper bound can be attained (and indeed it cannot always be).

In the next issue, he specialised to the case k = 3, t = 2; that is, we are looking for a collection of 3-element subsets of {1,2,…,n}, with the property that no two meet in more than one point. In terms of our original formulation, if we make a graph whose vertices are the 3-element subsets, two subsets joined if they meet in two points, we are looking for the largest null subgraph of such a graph. The upper bound mentioned above reduces to n(n−1)/6, and it is easy to show that it cannot be attained unless n is congruent to 1 or 3 (mod 6). [This is an instance of what I shall call the divisibility conditions.] Thomas Penyngton Kirkman, rector of Croft in Lancashire, gave a proof that, if this is satisfied, then a set of n(n−1)/6 triples with the required property can be found. In such a system, any two points are contained in exactly one of the distinguished triples.

Such configurations are now called Steiner triple systems, even though Steiner didn’t pose the problem of their existence until six years after Kirkman had solved it! An article by Norman Biggs and Robin Wilson in Combinatorics: Ancient and Modern (Oxford University Press 2013) untangles some of the history.

Kirkman clearly found that his job as a clergyman didn’t use all his energy; as well as this, he worked on group theory, knot theory, and polyhedra, and was elected Fellow of the Royal Society in 1857.

In general, a configuration meeting the bound in Woolhouse’s original problem is called a Steiner system, and denoted by S(t,k,n). In the years from 1847 to 2014, a number of specific examples were found, none having t larger than 5. (Examples of S(5,6,12) and S(5,8,24) were found in the first half of the twentieth century; they are associated with the sporadic simple groups found by Émile Mathieu.) Then in 2014, Peter Keevash announced a proof that, for any k and t, if n satisfies the relevant divisibility conditions and n is “sufficiently large” (in terms of k and t), then a Steiner system S(t,k,n) exists.

Since Keevash’s work, at least one further proof has been given. But nobody knows exactly how large n has to be (except in special cases).

The Erdős–Ko–Rado Theorem

A collection of k-element subsets of a set of n points is called an intersecting family if any two of its members have a point in common. More generally, it is t–intersecting if any two of its members have at least t points in common.

The question, What is the largest t-intersecting family of k-subsets of an n-set?, was tackled by Paul Erdős, Chao Ko, and Richard Rado in 1938. To understand what they did, first consider how you can obtain a large family. If we fix an element of the n-set, say 1, and take all the k subsets which contain 1, then clearly we get an intersecting family of size C(n−1,k−1). Indeed, if we choose all the subsets containing all of 1,2,…,t, we obtain a t-intersecting family of size C(n−t,k−t).

EKR, as I shall call them, showed that, provided n is sufficiently large (in terms of k and t), we cannot do better. Indeed, for intersecting families, this bound holds as long as n > 2k; and, if n > 2k+1, then the only families attaining the bound are those consisting of all the k-sets containing some fixed element.

Although this theorem was proved in 1938, it was not published until 1961. Paul Erdős said subsequently,

One of the reasons for the delay was that at the time there was relatively little interest in combinatorics. Also in 1938, Ko returned to China, I went to Princeton, and Rado stayed in England.

This result can also be put into our general framework. As we did before, form the graph whose vertices are the k-subsets of our n-set, two vertices joined if they have at least t points in common. We saw that Steiner systems come from null subgraphs of largest possible size; EKR families are complete subgraphs of largest possible size.

Comparison

However, there is an interesting contrast. In the case of Steiner systems, the bound is easy to establish, but a very complicated construction is required to build systems meeting the bound. In the EKR case, construction of the large systems is easy, but the proof that no larger family is possible is what requires the work!

Another thing to notice is that the product of the two bounds is

C(n−t,k−t)·C(n,t)/C(k,t) = C(n,k)

(a short exercise with binomial coefficients). The right hand side is the number of vertices of the graph. This reminds us of a property which symmetric graphs have:

If a graph on n vertices has automorphism group acting transitively on the vertices, then the product of the sizes of the largest complete and null subgraphs does not exceed the number of vertices of the graph.

This is a slightly more difficult exercise; it uses double counting, and two facts:

if a group G acts transitively on n points, then the number of elements of G mapping x to y is |G|/n (a consequence of Lagrange’s Theorem;
a complete and a null subgraph cannot have more than one vertex in common.

Now the graph in our example has C(n,k) vertices, and the symmetric group S_n acts transitively on the vertex set. This goes some way to “explaining” the relation between the two bounds, and also suggests that algebra has some part to play here.

(This is a small plug for a new book, Erdős–Ko–Rado Theorems: Algebraic Approaches, by Chris Godsil and Karen Meagher, recently published in the Cambridge Advanced Mathematics series.)

Thus Keevash’s Theorem says that, for this graph, if n is sufficiently large in terms of k and t, then the product of the sizes of the largest complete and null subgraphs of the graph is equal to the number of vertices if and only if the divisibility conditions are satisfied.

Generalisation

We can define a more general class of graphs whose vertices are the k-subsets of an n-set. All these graphs have the property that adjacency of vertices depends only on the size of the intersection of the subsets, and all of them admit the symmetric group acting vertex-transitively.

Let I be a subset of {0,1,…k−1}. We form a graph G_I on the vertex set by letting two subsets be declared adjacent if the size of their intersection is an element of the set I. In the case where I = ∅ or I = {0,1,…k−1}, the graph G_I is the null graph or the complete graph, respectively; we ignore these cases. Also we note that if I and J are complementary subsets of {0,1,…k−1}, then the graphs G_I and G_J are complements of each other (the edges of one are the non-edges of the other). So we have (2^k−2)/2 = 2^k−1−1 complementary pairs of “interesting” graphs.

Among these graphs, those for which I = {t,…k−1} play a special role, as we have seen; the largest possible null subgraphs of these are Steiner systems, and the largest possible complete subgraphs are EKR families. Thanks to the theorems of EKR and Keevash, we know that, for n sufficiently large, the product of the sizes of the largest complete and null subgraphs in these graphs is equal to the number of vertices.

Here is the big conjecture, which appears in a paper of mine with Mohammed Aljohani and John Bamberg, accepted for publication in Portugaliae Mathematica in the last week of September this year:

There is a function F on the natural numbers such that, given k and n, if n ≥ F(k) and the graph G_I defined above has the property that the product of the sizes of the largest complete and null subgraphs is equal to the number of vertices, then I or its complement is equal to {t,…k−1} for some value of t, and the divisibility conditions for a Steiner system S(t,k,n) are satisfied.

This was previously known for k = 2 and k = 3; in the paper, among other things, we prove it for k = 4, by a method which seems likely to generalise to larger values. Unfortunately, of course, the number of graphs we have to consider grows exponentially with k, as we saw.

The graph G_I has the property that its edge set is the disjoint union of the edge sets of the graphs G_{i} for i ∈ I. These graphs make up an association scheme. This means that their adjacency matrices commute and that their span over the real numbers is an algebra (that is, closed under multiplication).

Association schemes are very interesting combinatorial objects, with applications in group theory, computation, statistics, and elsewhere. But I will forbear to describe them further here, except to note that they provide a context for the problems described here, and asking similar questions in other association schemes leads to challenging research problems. Indeed, there are a number of inequalities for the sizes of complete and null subgraphs in terms of the eigenvalues of the matrices in the association scheme; and there are ways of computing these eigenvalues from counting information about the association scheme. In particular, the result that says that the product of complete and null graph sizes in any union of graphs in the association scheme is at most the number of points, irrespective if the scheme has any symmetry or not: transitivity on vertices is not necessary in this case.

An example

Although we conjecture that things are orderly when n is large, there are many fascinating examples not following this pattern for smaller n. Here is an example.

The Fano plane

The picture shows the famous Fano plane: seven points with seven 3-subsets, any two meeting in exactly one point. So it is a Steiner triple system S(2,3,7) (whose triples are called lines), and is also a complete subgraph of G_{1}. Our general inequality that the product of complete and null subgraph sizes is at most the number of points shows that a null subgraph of G_{1} has size at most 5.

There are null subgraphs of size 5: the EKR sets (all triples containing two given points give examples). But there are others with a completely different structures. Consider, for example, any line of the Fano plane together with the four 3-sets disjoint from it. Any two of these five sets meet in 0 or 2 points. (In the picture, 135, 137, 157, 357 and 246 is an example.)