Regenerative distributions on number partitions
Earlier this week, my new colleague Sasha Gnedin gave a talk on “Regenerative Combinatorial Structures”. Regeneration is what certain newts do: if they lose their tail, a limb, or even an eye, they grow a new one. Sasha’s idea was to take this principle and apply it to finite combinatorial objects, where removing something leaves a copy of what is there before. Now clearly a single structure cannot have this property; even an infinite chain of structures cannot have it in any interesting way. So, as he is a probabilist, he considered probability distributions, specifically on partitions of a positive integer.
Let Πn denote the set of all partitions of the integer n. A collection of probability distributions on Πn (one for each n) is
- coherent if subtracting one from a random part of an element of Πn gives a random element of Πn-1;
- regenerative if removing an entire part of an element of Πn gives a random element of the set of partitions of what remains.
Of course, there are some distributions to be specified here. For coherence, we could choose the part uniformly, or with probability proportional to its size (the latter more natural if we think of a partition of n as describing n balls in urns, and we choose a random ball to remove, rather than a random urn to remove a ball from.
The goal is to describe such distributions uniformly. This is done by a two-stage procedure, which I will illustrate with a lovely example, giving rise to the distributions which are the cycle lengths of a random permutation. (That is, the probability of an element of Πn is the probability that a random element of the symmetric group Sn has those cycle lengths.)
The first stage of the process is to take the unit interval and break it into intervals by some random procedure. Then we colour each interval with a different colour (this is known as Kingman’s paintbox).
For the second step, we choose n independent random points from the unit interval according to some distribution, which may have no connection with the distribution used in the first part. Then we have a partition of n given by the number of points of each colour in the selection.
For example, suppose that the intervals are obtained by the “stick-breaking procedure”. Break the unit interval at a random point. Then keep the left-hand part, and break the right-hand part at a random point. Continue this infinitely. Then in the second step, choose the points from the uniform distribution on the interval. This gives rise to the distribution of cycle lengths of a random permutation (a lovely fact which I didn’t know).
Sasha had many variants on this, and remarkably was able to find explicit formulae for the distribution on partitions in many cases.
Now where have I seen that before?
I have caught a nasty cough, which kept me awake last night. So I started thinking about this, and what it reminded me of: the answer is, countable homogeneous structures such as the random graph! As I have described earlier, this object has very strong regeneration properties: remove a finite set of vertices or edges, or a finite set of vertices and their common neighbours, or the edges of a locally finite subgraph, and what we obtain is still isomorphic to the random graph. Also, as the name suggests, it is produced by a very simple probability distribution.
I will talk just about relational structures: one of these consists of a set carrying various relations. So graphs, partially ordered sets, metric spaces are examples. Substructures are always induced, consisting of a subset with all the instances of relations involving its points. A relational structure is homogeneous if every isomorphism between finite substructures extends to an automorphism of the whole structure. Also, my infinite structures will be no larger than countable.
Of course, most relational structures are not homogeneous; but any structure can be made homogeneous by adding extra relations. (This triviality just asserts that any permutation group on a countable set, which is closed in the topology of pointwise convergence, is the automorphism group of some homogeneous structure.) Also, many interesting structures either are homogeneous (such as the random graph), or can be made so very easily (such as the “generic” bipartite graph, which is homogeneous as a graph with bipartition).
According to the terminology of Roland Fraïssé, the age of a countable relational structures is the class of finite structures which can be embedded into it. Now a class A of finite structures is the age of a countable homogeneous structure if and only if it satisfies the four conditions
- it is closed under isomorphism;
- it is closed under taking substructures;
- it contains only countably many non-isomorphic members;
- it has the amalgamation property, that is, given structures A, B1 and B2 in A, and embeddings fi : A→Bi for i=1,2, there is a structure C in A and embeddings gi: : Bi→C so that the obvious diagram commutes. [In other words, if B1 and B2 have a common substructure, they can be glued together amalgamating this substructure.]
Moreover, if these properties hold, the countable homogeneous structure M with age A is unique up to isomorphism; it is called the Fraïssé limit of A.
There is a strengthening of the amalgamation property known, naturally enough, as the strong amalgamation property, which asserts that the amalgamation can be done without making any extra identifications: so when we glue the two structures together along A, no points outside A become identified.
This property is equivalent to the following nice property of the Fraïssé limit: in the automorphism group, the stabiliser of any finite set of points has all its orbits on the remaining points infinite.
There are some very interesting structures which are homogeneous but for which this stronger property fails, for example “treelike objects”. However, I will be interested in the class for which it is true. The following result holds:
The age of a homogeneous structure M has the strong amalgamation property if and only if deleting any finite number of points of M gives a structure isomorphic to M.
This is the analogue of Gnedin’s coherence condition.
Now what about probability distributions?
Not so long ago, I discussed recent work of Fedor Petrov and Anatoly Vershik, where they constructed an exchangeable measure on countable graphs which is concentrated on Henson’s homogeneous universal triangle-free graph (that is, the random graph is isomorphic to Henson’s graph with probability 1), and a beautiful generalisation of this due to Nate Ackerman, Rehana Patel and Cameron Freer, who had managed to give a necessary and sufficient condition on a structure M for there to be an exchangeable measure concentrated on M. Their condition is precisely that the stabiliser of a finite set in the automorphism group of M has all its orbits on the remaining points infinite!
Now there is an even closer connection with what Gnedin was talking about. The method used by Petrov and Vershik, and followed by Ackerman, Freer and Patel, was to define a structure on the real numbers which is almost of the kind desired (thus, in the Petrov–Vershik case, the measure of the set of triangles is zero), and then obtain the countable structure by choosing countably many real numbers independently and taking the induced structure on them.
I will be amazed if there is not some deeper connection here!