This semester, Han Yu has run a reading group on Arithmetic Combinatorics. Mostly we have been following the excellent notes by Tuomas Orponen. I gave two talks. The first was a proof of the result of Ellenberg and Gijswijt on cap-sets over GF(3). The paper is just over 3 pages long, so no additional notes are necessary; in any case, I discussed it here last year after hearing Ben Green’s talk in Singapore.

My second talk was on sum-free sets, and was rather more discursive, so here is a summary.

### The context

A set of natural numbers is *k-AP-free* if it contains no *k*-term arithmetic progression, and is *sum-free* if it contains no solution to *x*+*y* = *z*.

Three big theorems in additive combinatorics are:

**Theorem 1 (van der Waerden)** For a natural number *k* > 2, the set **N** cannot be partitioned into finitely many *k*-AP-free sets.

**Theorem 2 (Roth–Szemerédi** For a natural number *k* > 2, a *k*-AP-free set has density zero.

**Theorem 3 (Schur)** The set **N** cannot be partitioned into finitely many sum-free sets.

At first sight, one would like a theorem to complete the pattern, asserting that a sum-free set has density zero. But this is false, since the set of all odd numbers is sum-free and has density 1/2. What follows could be motivated as an attempt to find a replacement for the missing fourth theorem.

### A bijection with Cantor space

The *Cantor space* *C* can be represented as the set of all (countable) sequences of zeros and ones. It carries the structure of a complete metric space (the distance between two sequences is a monotonic decreasing function of the index of the first position where they differ) or as a probability space (corresponding to a countable sequence of independent tosses of a fair coin).

We define a bijection between Cantor space and the set **S** of all sum-free subsets of **N**. Given a sequence *x* in *C*, we construct *S* as follows:

Consider the natural numbers in turn. When considering *n*, if *n* is the sum of two elements already put in *S*, then of course *n* is not in *S*. Otherwise, look at the first unused element of *x*; if it is 1, then put *n* into *S*, otherwise, leave *n* out of *S*. Delete this element of the sequence and continue.

For example, suppose that *x* = 10110…

- The first element of
*x*is 1, so 1 ∈*S*. - 2=1+1, so 2 is not in
*S.* - 3 is not in
*S*+*S*; the next element of*x*is 0, so 3 is not in*S*. - 4 is not in
*S*+*S*; the next element of*x*is 1, so 4 is in*S*. - 5=1+4, so 5 is not in
*S*. - 6 is not in
*S*+*S*; the next element of*x*is 1, so 6 is in*S*. - …

So *S* = {1,4,6,…}.

### Baire category

The notion of “almost all” in a complete metric space is a *residual set*; a set is residual if it contains a countable intersection of dense open sets. Thus, residual sets are non-empty (by the Baire Category Theorem); any countable collection of residual sets has non-empty intersection; a residual set meets every non-empty open set; and so on.

A sum-free set is called *sf-universal* if everything which is not forbidden actually occurs. Precisely, *S* is sf-universal if, for every subset *A* of {1,…,*n*}, one of the following occurs:

- there are
*i,j*∈*A*with*i*<*j*and*j−i*∈*S*; - there exists
*N*such that*S*∩[*N*+1,…,*N*+*n*] =*N*+*A*,

where *N*+*A* = {*N*+*a*:*a*∈*A*}.

**Theorem** The set of sf-universal sets is residual in **S**.

**Theorem (Schoen)** A sf-universal set has density zero.

Thus our “missing fourth theorem” asserts that almost all sum-free sets (in the sense of Baire category) have density zero.

There is a nice application. Let *S* be an arbitrary subset of **N**. We define the *Cayley graph* Cay(**Z**,*S*) to have vertex set **Z**, with *x* joined to *y* if and only if |*y−x*|∈*S*. Note that this graph admits the group **Z** acting as a shift automorphism on the vertex set.

**Theorem**

- Cay(
**Z**,*S*) is triangle-free if and only if*S*is sum-free. - Cay(
**Z**,*S*) is isomorphic to Henson’s universal homogeneous triangle-free graph if and only if*S*is sf-universal.

So Henson’s graph has uncountably many non-conjugate shift automorphisms.

### Measure

In a probability space, large sets are those which have measure 1, that is, complements of null sets. Just as for Baire category, these have the properties one would expect: the intersection of countably many sets of measure 1 has measure 1; a set of measure 1 intersects every set of positive measure; and so on.

The first surprise is that measure and category give entirely different answers to what a typical set looks like:

**Conjecture** The set of sf-universal sets has measure zero.

Although this is not proved yet (to my knowledge), it is certain that this set does not have measure 1.

Given the measure on **S**, and our interest in density, it is natural to ask about the density of a random sum-free set. This can be investigated empirically by computing many large sum-free sets and plotting their density. Here is the rather surprising result.

The spike on the right corresponds to density 1/4 and is explained by the following theorem.

**Theorem**

- The probability that a random sum-free set consists entirely of odd numbers is about 0.218 (in particular is non-zero).
- Conditioned on this, the density of a random sum-free set is almost surely 1/4.

A word about this theorem. Suppose that we have tossed the coin many times and found no even numbers. Then we have chosen the odd numbers approximately independently, so the next even number is very likely to be excluded as the sum of two odd numbers, whereas the next odd number is still completely free. So the probability of no even numbers is not much less than the probability of no even numbers in the first *N* coin tosses. Moreover, since the odd numbers are approximately independent, we expect to see about half of them, and so about a quarter of all numbers.

Other pieces can also be identified. Let **Z**/*n***Z** denote the integers modulo *n*. We can define the notion of a sum-free set in **Z**/*n***Z** in the obvious way. Such a sum-free set *T* is said to be *complete* if, for every *z* in (**Z**/*n***Z**)\*T*, there exist *x,y* in *T* such that *x*+*y* = *z* in **Z**/*n***Z** . Now the theorem above extends as follows. Let **S**(*n,T*)$ denote the set of all sum-free sets which are contained in the union of the congruence classes *t* (mod *n*) for *t*∈*T*.

**Theorem** Let *T* be a sum-free set in **Z**/*n***Z**.

- The probability of
**S**(*n,T*) is non-zero if and only if*T*is complete. - If
*T*is complete then, conditioned on*S*∈**S**(*n,T*), the density of*S*is almost surely |*T*|/2*n*.

In the figure it is possible to see “spectral lines” corresponding to the

complete modular sum-free sets {2,3} (mod 5) and {1,4} (mod 5) (at density 1/5) and {3,4,5} (mod 8) and {1,4,7} (mod 8) (at density 3/16).

The density spectrum appears to be discrete above 1/6, and there is some

evidence that this is so. However, a recent paper of Haviv and Levy shows

the following result.

**Theorem** The values of |*T*|/2*n* for complete sum-free sets *T* in **Z**/*n***Z** are dense in [0,1/6].

However, this is not the end of the story. Neil Calkin and I showed that the event that 2 is the only even number in *S* has positive (though rather small) probability. More generally,

**Theorem** Let *A* be a finite set and *T* a complete sum-free set modulo *n*. Then the event *A* ⊆ *S* ⊆ *A*∪(*T* (mod *n*)) has positive probability.

**Question** Is it true that a random sum-free set almost surely has a density? Is it further true that the density spectrum is discrete above 1/6 but has a continuous part below 1/6? Is it also true that the probability that a sum-free set has zero density is 0?

In this connection, consider the following construction of Calkin and Erdős. Let α be an irrational number, and define *S*(α) to be the set of natural numbers *n* for which the fractional part of *n*α lies between 1/3 and 2/3. It is easy to see that *S*(α) is sum-free and has density 1/3. However this does not resolve the question, since the event *S* ⊆ *S*(α) for some α has probability zero.

However, there might be other examples along these lines …

### Ultimate periodicity

A sequence *x* is *ultimately periodic* if there exist positive integers *n* and *k* such that *x*_{i+k} = *x _{i}* for all

*i*≥

*n*. Analogously, a sum-free set

*S*is

*ultimately periodic*if (

*i*∈

*S*) ⇔ (

*i*+

*k*∈

*S*) for all

*i*≥

*n*.

It is easy to see that, in our bijection from sequences to sum-free sets, a sequence which maps to an ultimately periodic sum-free set must itself be ultimately periodic. What about the converse?

**Question** Is it true that the image of an ultimately periodic sequence is an ultimately periodic sum-free set?

After some investigation, Neil Calkin and I conjectured that the answer is “no”. There are some ultimately periodic sequences (the simplest being 01010 repeated) for which no sign of periodicity has been detected despite computing nearly a million terms. These sets are fascinating, and seem sometimes to exhibit an “almost periodic” structure; they settle into a period, which is then broken and a longer period established, and so on.

Is there a typo in your definition of sf-universal, Peter? It looks like i and j should be in A rather than {1,…,n}.

Yes, you are right. Fixed now.