## Depth-first search

On Friday we had three lovely talks by Benny Sudakov and two of his students from UCLA, who are currently visiting Cambridge. The students both gave beautifully clear talks; Choongbum Lee wrote on the board, while we got Huang Hao’s laptop talking to the projector with no trouble whatsoever. There was a very good crowd, with people from all over London and further afield as well as locals who don’t always come to CSG meetings.

I want to say a bit about Benny Sudakov’s talk. This is joint work with Michael Krivelevich: a new, very simple proof of one of the fundamental results about random graphs, from the classic paper of Erdős and Rényi. This is the theorem about the threshold for a “giant component” of a random graph. If we take n vertices, and join each pair independently with probability p, then (with ε a small positive number),

• if p = (1−ε)/n, then all connected components are small (size Oε(log n) with high probability);
• if p = (1+ε)/n, then there is a “giant component” of size Ωε(n) with high probability.

Here the phrase “with high probability” means “with probability tending to 1 as n→∞”; Oε(log n) means bounded above by c·log n, where the constant c depends on ε; and Ωε similarly means bounded below.
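The two regimes are easy to see empirically. Here is a minimal simulation sketch in Python (the function and parameter names are mine, and a union-find shortcut stands in for explicit component exploration):

```python
import random
from collections import Counter

def largest_component(n, p, seed=0):
    """Size of the largest connected component of one sample of G(n, p),
    found by union-find (an illustrative sketch; names are mine)."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):                          # find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u in range(n):                    # toss the biased coin for every pair
        for v in range(u + 1, n):
            if rng.random() < p:          # the pair uv is an edge: merge
                parent[find(u)] = find(v)

    return max(Counter(find(v) for v in range(n)).values())

n, eps = 2000, 0.5
small = largest_component(n, (1 - eps) / n)   # subcritical: all components small
giant = largest_component(n, (1 + eps) / n)   # supercritical: a giant component
```

With these parameters the subcritical sample typically has a largest component of logarithmic size, while the supercritical one has a component containing a constant fraction of the vertices.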

Apart from its simplicity, the new proof does several things:

• it gives explicitly the dependence of the constants on ε in the O and Ω results;
• it shows that, in the second case, the “giant component” contains a path of length Ωε(n), with again an explicit constant;
• it extends to similar situations such as random subgraphs of a fixed graph, and positional (Maker-Breaker) games on the complete graph.

The idea behind the proof in the supercritical case is just depth-first search. I never took a course in computer science, so while I know what depth-first search is, I would have a job to write such a clear specification of it as Benny gave us.

We wish to explore a graph. While doing so, we maintain three sets of vertices: S, the set of vertices we have finished with; U, the set of vertices currently under investigation; and T, the remainder, which have not yet been considered. The set U has the structure of a stack (i.e. last-in, first-out). We initialise with S = U = ∅, T = V (where V is the whole vertex set); when U = T = ∅ and S = V, we are finished. The vertices of the graph are totally ordered.

Here is one step. If U is non-empty, let u be the “top” vertex (the last one added). If u has any neighbours in T, we choose the first one, and add it to U; if not, then we remove u from U and add it to S. On the other hand, if U is empty but T is not, we choose the first vertex in T and add it to U. If neither move is possible, we are finished.
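The step just described translates almost line for line into code. Here is a sketch in Python (S, U, T as in the text; the function name and the choice of 0, …, n−1 as the vertex ordering are mine):

```python
def dfs_components(n, adjacency):
    """Depth-first search as specified: S = finished vertices, U = the
    stack of vertices under investigation, T = not yet considered.
    Vertices are 0..n-1 in their total order; `adjacency` maps each
    vertex to its set of neighbours.  Returns the connected components
    in the order the search finishes them."""
    S, U, T = set(), [], set(range(n))
    components, current = [], []

    while U or T:
        if U:
            u = U[-1]                         # the "top" vertex of the stack
            nbrs_in_T = sorted(adjacency[u] & T)
            if nbrs_in_T:
                v = nbrs_in_T[0]              # u's first neighbour in T
                T.remove(v)
                U.append(v)                   # move v from T to U
            else:
                S.add(U.pop())                # u exhausted: move it from U to S
                current.append(u)
        else:                                 # U empty: start a new component
            if current:
                components.append(current)
                current = []
            v = min(T)                        # the first vertex of T
            T.remove(v)
            U.append(v)
    if current:
        components.append(current)
    return components
```

For example, on a triangle plus an isolated vertex, the search returns the triangle first and the singleton second.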

Note a couple of things about the algorithm:

• at each step, a single vertex is moved, either from T to U, or from U to S;
• there are no edges between S and T;
• between successive “emptyings” of U, we are in a single connected component of the graph;
• the stack U is actually a path in the graph, since a new vertex added is adjacent to the existing top vertex.
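These observations are easy to check mechanically. Here is a small self-check sketch (Python, with my own names; it samples one random graph and asserts the "no S–T edges" and "U is a path" properties after every step):

```python
import random

def dfs_with_invariant_checks(n, p, seed=1):
    """Run the depth-first search on one sample of G(n, p) and assert,
    after each step, that no edge joins S to T and that consecutive
    entries of the stack U are adjacent (so U is a path)."""
    rng = random.Random(seed)
    edges = {(u, v) for u in range(n) for v in range(u + 1, n)
             if rng.random() < p}
    adj = lambda a, b: (min(a, b), max(a, b)) in edges

    S, U, T = set(), [], set(range(n))
    while U or T:
        if U:
            u = U[-1]
            nbrs = [v for v in sorted(T) if adj(u, v)]
            if nbrs:
                T.remove(nbrs[0])
                U.append(nbrs[0])
            else:
                S.add(U.pop())
        else:
            v = min(T)
            T.remove(v)
            U.append(v)
        # no edges between S and T:
        assert not any(adj(s, t) for s in S for t in T)
        # the stack U is a path in the graph:
        assert all(adj(U[i], U[i + 1]) for i in range(len(U) - 1))
    return True
```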

We apply depth-first search (DFS) to the random graph. We are not actually searching for anything, merely watching the algorithm as it runs. We might as well be lazy; rather than generate the graph and then process it, we only toss our biased coin for a given pair of vertices when we need to know whether that pair is an edge or not.
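A sketch of this lazy variant (names are mine; the coin for each pair is tossed at most once, the first time the algorithm asks about it, and the outcome is remembered):

```python
import random

def lazy_dfs(n, p, seed=0):
    """Depth-first search on G(n, p) where each pair's biased coin is
    tossed only when the algorithm first needs to know whether that
    pair is an edge.  Returns the longest stack seen (the stack is
    always a path in the graph) and the number of pairs tossed."""
    rng = random.Random(seed)
    revealed = {}                          # pair -> outcome of its one coin toss

    def is_edge(u, v):
        pair = (min(u, v), max(u, v))
        if pair not in revealed:
            revealed[pair] = rng.random() < p
        return revealed[pair]

    S, U, T = set(), [], set(range(n))
    longest = 0
    while U or T:
        if U:
            u = U[-1]
            for v in sorted(T):            # query u's possible neighbours in T
                if is_edge(u, v):
                    T.remove(v)
                    U.append(v)
                    longest = max(longest, len(U))
                    break
            else:
                S.add(U.pop())             # no neighbour in T: u is finished
        else:
            v = min(T)
            T.remove(v)
            U.append(v)
            longest = max(longest, len(U))
    return longest, len(revealed)
```

The two extremes are instructive: with p = 1 the search walks a Hamilton path and tosses only n − 1 coins, while with p = 0 it must toss the coin for every pair before giving up on each vertex.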

Now suppose that p = (1+ε)/n. We run the process until we have tossed the coin a predetermined number of times. By properties of the binomial distribution, we can put a lower bound on the number of heads we have seen (that is, the number of edges known to be present) at this point. Assuming, for a contradiction, that there is no long path, we know that U is always small; since all edges meet U, we can put an upper bound on the number of edges. But these bounds conflict. And that is it, apart from the book-keeping! 
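To give a flavour of the book-keeping, here is my own rough reconstruction of how the two bounds might clash (the constants are illustrative, for ε small; the paper's accounting is sharper):

```latex
Let $p = \frac{1+\varepsilon}{n}$ and stop after $N = \frac{\varepsilon n^2}{2}$
coin tosses.  By a Chernoff bound, with high probability
\[
  \#\mathrm{heads} \;\ge\; (1-\delta)\,Np
  \;=\; (1-\delta)(1+\varepsilon)\,\tfrac{\varepsilon n}{2},
  \qquad \delta \ll \varepsilon .
\]
Each head moves a vertex from $T$ to $U$, so at least this many vertices leave
$T$ within the first $N$ tosses.  Now suppose there is no path of length $k$.
Since the stack $U$ is a path, $|U| \le k$ throughout, so at least
$s = (1-\delta)(1+\varepsilon)\tfrac{\varepsilon n}{2} - k$ vertices reach $S$.
When the $i$-th vertex moves to $S$, every pair joining it to the current $T$
has been tossed and come up tails, and
$|T| = n - i - |U| \ge n - i - k$; these pairs are distinct across moves, so
\[
  \#\mathrm{tails} \;\ge\; \sum_{i=0}^{s-1}(n-i-k)
  \;=\; s(n-k) - \binom{s}{2} ,
\]
which for $k = \varepsilon^2 n/5$ and $\varepsilon$ small exceeds
$N \ge \#\mathrm{tails}$, a contradiction.  So with high probability there is a
path of length $\varepsilon^2 n/5$.
```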