Pilsen, days 1 and 2

Pilsen, Republic Square

The conference on “Symmetry vs Regularity” in Pilsen has begun.

I hesitate at trying to describe the talks, since the first day alone had 13 talks, and the second day 19, of lengths ranging from ten minutes to an hour, with no parallel sessions.

Random thoughts will have to suffice.

The meeting largely concerns the structures variously known as association schemes or coherent configurations. These had at least three independent origins: by R. C. Bose and his students and collaborators in experimental design in statistics; by Issai Schur in permutation group theory; and by Weisfeiler and Leman in studying the graph isomorphism problem (it is the fiftieth anniversary of their paper that we are specifically celebrating at this conference).

Subsequently the methods began to coalesce: Jaap Seidel recognised the connections between Bose’s and Higman’s work, and as communications between the Soviet Union and the west improved we were able to appreciate each other’s contribution. Delsarte applied Bose’s methods to coding theory and design theory, and Bannai and Ito wrote an important book laying out the essentials of the theory.

I began life in Schur’s school in this respect. As a beginning D.Phil student at Oxford, I was given by Peter Neumann his long preprint on extending Wielandt’s theorem on primitive groups of degree 2p (p prime) to degree 3p. Then Donald Higman visited Oxford and lectured on coherent configurations, perhaps his first complete presentation of the theory he had developed; one of the key ingredients, what we now sometimes call the “Bose–Mesner isomorphism” from the algebra generated by the basis matrices to the algebra generated by the intersection matrices, he attributes to Wielandt (see the Appendix of his 1967 paper on intersection matrices). Wielandt was a student of Schur and has much to say on “The method of Schur” and its connection with representation theory in his book on Finite Permutation Groups, which is dedicated to Schur.

Anyway, I was one of two students assigned to take notes of Higman’s lectures (the other was Susannah Howard); the notes, corrected by Higman, were published in the Mathematical Institute Lecture Notes series.

This leads me to the talk by Alexander Gavrilyuk on “Graham Higman’s method”. I bear some of the responsibility for this. Graham Higman gave a formula for the character of an automorphism group of an association scheme on one of its eigenspaces, in terms of (in Delsarte’s terminology) the Q-matrix of the scheme and the functions a_i, where a_i(g) is the number of points x such that the pair (x,x^g) satisfies the ith relation of the scheme. He used this to show that the hypothetical Moore graph of valency 57 could not have a transitive automorphism group. Higman never published this (though he addressed the London Mathematical Society on it), but I thought it was potentially a useful technique and gave it as an example in my Permutation Groups book.

Meanwhile, C. T. Benson had used it to study automorphisms of generalised quadrangles and beyond, and later authors had extended this to other generalised polygons, perhaps without realising that it works for any association scheme. Two years ago in Singapore, Stefaan de Winter talked about applications of this, and I briefly discussed Higman’s work with him. Alexander did mention some more recent work; it seems people are on the case at last.

Another small observation. Walter Feit and Graham Higman proved their theorem by considering the collinearity graph of the generalised polygon (whose vertices are the points, adjacent if collinear); this graph is distance-transitive, so the association scheme is symmetric and commutative, and eigenvalues tell you everything. Donald Higman gave a new proof of the theorem using the coherent configuration based on flags of the geometry; this is non-commutative and one has to study the more difficult 2-dimensional representations, but the payoff is that one gets more refined information, including “Higman’s inequality” for generalised quadrangles. This approach fits better with the BN-pair structure in the group case.

My account of the Singapore meeting is here.

But there is a small historical mystery here. Did Benson get the idea from the Feit–Higman paper, as Alexander thought possible? (I have looked through the paper and I don’t think this is very likely.) Was there another connection? I am inclined to believe that the two inventions were completely independent. Perhaps it should be the Benson–Higman method?

In the evening there were four short talks. Two nice things for me. Bogdana Oliynyk talked about an extension of an old paper I wrote with Sam Tarzi about diagonal limits of cubes; she and her co-authors have extended it to Hamming spaces over arbitrary finite alphabets. Then Alicia Roca talked about invariant subspaces. There was some nice combinatorics in there, but a key paper included Bill Longstaff as an author, and this took me back. He was an undergraduate with me, and about 40 years ago I met up with him on a visit to Perth, where he told me some of this material. I think it was on the same visit that Cheryl Praeger and I proved the theorem about dual polar graphs that Sasha Ivanov mentioned in his talk opening the conference.

A highlight for me of the second morning was the talk of Oleg Verbitsky, who looked at the Weisfeiler–Leman algorithm from the viewpoint of logic. Success of the k-dimensional version corresponds to a formula in first-order logic with counting quantifiers which characterises the graph up to isomorphism, with k+1 variables; the quantifier depth determines the number of rounds required. This was explained with a very nice Ehrenfeucht game on the pair of graphs. The two players are provided with matching collections of coloured pebbles. The first player is trying to demonstrate non-isomorphism of the two graphs, and the second player to prevent this. The pebbles correspond to the variables in the formula, and the number of rounds of the game to the quantifier depth.
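For readers who have not seen Weisfeiler–Leman in action, here is a minimal sketch (my own illustration in Python, not anything from the talk) of the 1-dimensional version, usually called colour refinement: repeatedly recolour each vertex by its current colour together with the multiset of its neighbours’ colours, until the colouring stabilises.

```python
def colour_refinement(adj):
    """1-dimensional Weisfeiler-Leman (colour refinement) on a graph
    given as an adjacency list {vertex: set of neighbours}.
    Returns the stable colouring as a dict vertex -> colour id."""
    colour = {v: 0 for v in adj}
    while True:
        # New colour of v = old colour plus multiset of neighbours' colours.
        signature = {v: (colour[v], tuple(sorted(colour[u] for u in adj[v])))
                     for v in adj}
        # Canonically renumber the signatures in first-occurrence order.
        ids, new = {}, {}
        for v in adj:
            if signature[v] not in ids:
                ids[signature[v]] = len(ids)
            new[v] = ids[signature[v]]
        if new == colour:
            return colour
        colour = new

# Colour refinement distinguishes a path from a star on 4 vertices:
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
hist = lambda c: sorted(c.values())
print(hist(colour_refinement(path)) == hist(colour_refinement(star)))  # False
```

Success of the k-dimensional version (which refines colours of k-tuples rather than vertices) is what corresponds to the counting-logic formula with k+1 variables.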

The highlight of the afternoon was the talk by John Wilmes. He was a student of Laci Babai, and he and Xiaorui Sun took the first step since 1980 towards a conjecture of Babai. Back then, Babai proved that primitive but not 2-transitive permutation groups have order bounded by exp(n^{1/2}) (this is slightly inaccurate, since we might have to put some log n factors in the exponent; I will be similarly cavalier throughout this post). This is best possible, since both the symmetric group acting on 2-sets, and the automorphism group of the square lattice, are primitive groups with order about this value. (Indeed Babai’s bound and reality differ by a constant and a single log factor in the exponent.) But actually Babai proved much more: his bound applies to the automorphism group of any non-trivial coherent configuration.

About the same time, the Classification of Finite Simple Groups was announced, and using this, Babai’s result for groups could be deduced. Indeed, much stronger results can be proved; with “known” exceptions (essentially powers of the symmetric or alternating group acting on k-sets, with some group permuting the coordinates), the orders of primitive groups do not exceed n^{log n + 1}. Babai conjectured that, with a similar list of exceptions, the order of the automorphism group of a primitive coherent configuration cannot exceed exp(n^ε) for arbitrarily small ε > 0. His theorem gave ε = 1/2; and, of course, CFSG is no help here, except perhaps as a guiding principle.

Now Sun and Wilmes have reduced this to ε = 1/3. So, in particular, they have to deal with the two examples with very large automorphism groups, which they do by building “clique geometries” for them; these turn out to be partial geometries in the sense of Bose (who was perhaps the first to use geometric arguments on cliques to characterise strongly regular graphs). It is a beautiful combinatorial argument; but here I will only say, you should have been there! It was also a beautifully planned and executed talk, with just the amount of detail that could be comfortably fitted into the time allowed.

One thing from the evening session: Mikhail Kagan was using resistance distance as an alternative to the Weisfeiler–Leman algorithm. This is something I have an interest in, as it connects (among other things) with statistical optimality. I would like to see the result of resistance distance applied to the collaboration graph!
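Resistance distance is easy to compute: regard each edge as a unit resistor and solve Kirchhoff’s equations on the graph Laplacian. Here is a small self-contained sketch (my own illustration, not Kagan’s code), using exact rational arithmetic:

```python
from fractions import Fraction

def resistance_distance(n, edges, s, t):
    """Effective resistance between distinct nodes s and t in a
    unit-resistance graph on vertices 0..n-1: ground node t, inject
    unit current at s, and solve the reduced Laplacian system."""
    # Build the graph Laplacian.
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    # Delete row and column of the grounded node t.
    idx = [i for i in range(n) if i != t]
    A = [[L[i][j] for j in idx] for i in idx]
    b = [Fraction(1) if i == s else Fraction(0) for i in idx]
    # Gauss-Jordan elimination on [A | b].
    m = len(A)
    for col in range(m):
        piv = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(m):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                for c in range(col, m):
                    A[r][c] -= f * A[col][c]
                b[r] -= f * b[col]
    v = [b[i] / A[i][i] for i in range(m)]
    return v[idx.index(s)]  # potential at s, with t at potential 0

# Across an edge of a triangle: 1 ohm in parallel with 2 ohms = 2/3.
print(resistance_distance(3, [(0, 1), (1, 2), (0, 2)], 0, 1))  # 2/3
```

The vector of pairwise resistance distances refines degree information, which is what makes it a plausible alternative invariant to compare against Weisfeiler–Leman.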


British Mathematical Colloquium: satellite meetings

The BMC was followed by three satellite meetings, on Semigroups and Applications, Geometric Group Theory, and Ergodic Theory. I had a lot of catching up and other jobs to do, and so was unable to go to more than a few of these talks. I managed three.

Gareth Tracey talked about crowns in finite group theory. If we are studying minimal generating sets in finite groups, it is useful to understand groups G which require more generators than any proper quotient; these are crowns. Given an arbitrary group G, for any normal subgroup N, we know that the number of generators of G/N does not exceed the number for G; so, if we choose N maximal such that equality holds, then G/N is a crown. Gareth has used his results on crowns for various things; in particular, he gave us a preview of his result with Colva Roney-Dougal on the number of subgroups of the symmetric group S_n.

Tom Coleman told us about his work (much of it in his recent PhD thesis) on generation, cofinality and strong cofinality, and the Bergman property for various transformation semigroups on an infinite set, such as all maps, injective maps, surjective maps, bijective maps, partial maps, and bijective partial maps on a countable set, and analogous things for homomorphisms etc of the random graph. The Bergman property for a structure asserts that, if S is a generating set, then any element of the structure is a word of bounded length in elements of S.

Justine Falque talked about her work with Nicolas Thiéry on the orbit algebra of a permutation group in the case where the number of orbits on n-sets is bounded by a polynomial in n. They show that the algebra is finitely generated and Cohen–Macaulay, so that the generating function for the number of orbits on n-sets is a rational function of the form P(x)/∏(1−x^{d_i}), where P is a polynomial. I have already discussed this here; I have read the short version on the arXiv and eagerly await the full version.


British Mathematical Colloquium, days 3 and 4

The day began with the plenary talk by Paul Seymour, whom I’ve known for longer than almost anyone else at the meeting.

He explained that there are many different “ordering” relations on graphs; the two he considers most important are the minor and induced subgraph orders. Roughly speaking, he worked on the minor ordering in the second millennium and the induced subgraph ordering in the third.

An induced subgraph of a graph is obtained by throwing away some vertices and the edges incident with them; you are not allowed to throw away an edge within the set of vertices you are keeping. Paul began with the general problem: given a graph H, can you determine the structure of graphs G containing no induced copy of H?

The first issue is what you mean by “structure” here. Paul thinks of it as a construction for such graphs. But even this needs explaining. You can construct all triangle-free graphs by taking a set of vertices and adding edges, taking care never to create a triangle. But clearly this is a hopeless construction and tells you nothing about triangle-free graphs!

The answer is known in embarrassingly few cases:

  • A graph with no induced path of length 1 is a null graph.
  • A graph with no induced path of length 2 is a disjoint union of complete graphs.
  • Graphs with no induced path of length 3 form a well-understood class – for such a graph G, either G or its complement is disconnected, so it is clear how they can be built inductively from smaller graphs by taking disjoint unions with all or no edges between. (These graphs have many names, including cographs or N-free graphs.)
  • A sequence of seven papers of Paul, with Maria Chudnovsky, determines the structure of claw-free graphs, those containing no induced K_{1,3}; there are many interesting examples.
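As a small aside, the inductive description in the third item translates directly into a recognition algorithm for cographs: a graph on at least two vertices is a cograph exactly when it or its complement is disconnected and all the pieces are again cographs. A rough sketch (my own illustration, not from the talk):

```python
def is_cograph(adj):
    """Test the cograph property recursively: a graph on >= 2 vertices
    must be disconnected, or have disconnected complement, with all
    pieces again cographs. adj maps each vertex to its neighbour set."""
    verts = set(adj)
    if len(verts) <= 1:
        return True
    comps = components(adj)
    if len(comps) == 1:
        # Connected: the complement must disconnect instead.
        comp_adj = {v: verts - adj[v] - {v} for v in verts}
        comps = components(comp_adj)
        if len(comps) == 1:
            return False
        adj = comp_adj  # cographs are closed under complementation
    return all(is_cograph({v: adj[v] & c for v in c}) for c in comps)

def components(adj):
    seen, comps = set(), []
    for v in adj:
        if v not in seen:
            stack, comp = [v], set()
            while stack:
                u = stack.pop()
                if u not in comp:
                    comp.add(u)
                    stack.extend(adj[u] - comp)
            seen |= comp
            comps.append(comp)
    return comps

p3 = {0: {1}, 1: {0, 2}, 2: {1}}              # path of length 2
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}   # path of length 3
print(is_cograph(p3), is_cograph(p4))  # True False
```

The path of length 3 is the unique minimal obstruction, which is why these graphs are also called P4-free.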

And that’s it! Not even for a 4-cycle is the answer known. There is a structure theory for bull-free graphs modulo the structure of triangle-free graphs and their complements, which again is not easy. (The bull is a triangle with horns, that is, pendant edges at two of its three vertices.)

If we are content with less than a complete description, there are various conjectures you can try: does a forbidden subgraph force the existence of a “large” complete or null subgraph? For arbitrary graphs on n vertices we can’t do better than log n (Ramsey’s theorem together with the probabilistic method); can we achieve n^c (“polynomial”), or even cn (“linear”)? Can we find two large sets with all or no possible edges between them?

Another question asks for χ-boundedness, the property that the chromatic number χ(G) is bounded by a function of the clique number ω(G). The strong perfect graph theorem, proved by Paul and collaborators, proves the conjecture of Berge by showing that if G contains no odd hole (induced odd cycle) or odd antihole (induced complement of an odd cycle) then G has equal clique and chromatic numbers. He went on to discuss some new and powerful results he and others have proved, where we forbid just some holes.

There was a lot more but that will do for now.

David Conlon’s advertised title was “How to build a hypergraph expander”, and he did indeed show us this in the second part of his talk. But his real subject was “The unreasonable effectiveness of mixing algebra and probability”. The first half of his talk mainly concerned the function ex(n,H), the maximum number of edges in a graph G on n vertices not containing H as a (not necessarily induced) subgraph. The Erdős–Stone theorem gives as an upper bound a fraction (1 − 1/(χ(H)−1) + ε) of all edges of K_n, and Turán graphs show this is tight. The problem occurs when χ(H) = 2, that is, H is bipartite, when the theorem only gives a fraction o(1), that is, o(n^2) edges. There is some evidence that the correct asymptotics should be around c·n^{2−1/s}, where s is the size of the smaller bipartite block of H. This is known to hold in a few cases, including K_{2,2} and K_{3,3}, where algebraic constructions over finite fields realise the lower bound. The probabilistic method straightforwardly gives about n^{2−1/s−1/t}. David and coauthors have improved this by a method involving choosing “random algebraic varieties”.

For hypergraph expanders, even the best definition is not known: spectral, combinatorial, rapid mixing of some random walk, topological (Gromov), coboundary, …? David has constructions involving rapid mixing of a random walk on pairs on the neighbourhoods of vertices in a Cayley graph in an elementary abelian 2-group. As he remarked, received wisdom is that abelian groups are bad for rapid mixing, but the commutative property was essential in his argument.

Last before lunch was a beautiful talk by Simon Smith on infinite permutation groups. He assumed some topological knowledge (he liked his groups to be totally disconnected and locally compact (tdlc): the study of locally compact groups reduces to two cases, Lie groups and tdlc groups). His permutation groups were transitive and subdegree-finite (this means that a point stabiliser has all its orbits finite – automorphism groups of locally finite connected graphs are examples). There is a close connection: a closed subdegree-finite group is tdlc, and a tdlc group has a natural action as a subdegree-finite permutation group.

An important construction for finite permutation groups is the wreath product in its product action. Simon has another construction which he calls the box product. If X and Y are the domains of G and H, take a collection of copies of X, with complete graphs on each; and glue them together, any two meeting in one point and the copies through any point indexed by Y. Then there is a way to make a group act on this graph, so that the stabiliser of a lobe (copy of X) has G acting, while the stabiliser of a point has H acting on the lobes through that point. This is the box product. He has a theorem for when the box product is primitive, mirroring what happens for wreath product, and an O’Nan–Scott type theorem for primitive subdegree-finite groups.

After lunch and the (brief) business meeting of the Colloquium, I returned to the algebra workshop to hear Brent Everitt talking about three different ways to attach (co)homology to a hyperplane arrangement (a finite set of hyperplanes in a vector space). Note that the hyperplanes and their intersections have the structure of a lattice. The first method is simply to remove the hyperplanes and consider what’s left. Over the real numbers, this falls into many pieces, but over the complex numbers it is connected, and the Orlik–Solomon theorem gives its cohomology over the integers: there is no torsion, and the Betti numbers are expressible in terms of the Möbius function of the lattice.

The second method is to form the chain complex of the lattice (with top and bottom elements removed), the simplicial complex whose simplices are the chains. This is homotopy-equivalent to a bunch of spheres; the Goresky–MacPherson theorem connects its reduced homology to the previous case.

The third solution puts a sheaf of R-modules on the lattice, and takes its simplicial cohomology with local coefficients. There is a canonical sheaf: each element of the lattice is a subspace; just use that subspace.

I ducked out and went to the history stream to hear Rosemary Bailey talk about Latin squares. Much of it was familiar to me, but there were some eye-openers, including the use in an experiment on diets and slaughtering times for sheep, reported in France in 1770, a possible proof (now lost) of the non-existence of orthogonal Latin squares of order 6 by Clausen in 1842, and a row between Fisher and Neyman about whether using a Latin square in an experiment gives biased results (with a shocking recent tailpiece).

I missed the London Mathematical Society meeting because I had a visitor; then there was a wine reception followed by a very nice dinner. I wish I had felt well enough to enjoy it more; but at least I could eat it, which is a sign of progress!

The last morning had just two morning talks and one plenary.

The first morning talk I went to was by Vicky Gould, who gave us a crash course in semigroup theory followed by the biorder on the idempotents of a semigroup and the “free idempotent-generated semigroup” generated by a biordered set. This is a fairly recent topic where the arguments are not straightforward. It is a free object whose generators correspond to the idempotents, and relations ef = g if the product of e and f is g in the original semigroup (or, abstractly, if e and f are comparable in one of the orders). Nice results but I won’t attempt to confuse you with a proper account.

Second, Daniel Král’ talked about graphons and their analogues for permutations, permutons. A sequence (Gn) of, say, graphs is said to converge if the probability that a random set of size |H| in Gn induces a copy of H converges for all graphs H. So what does the sequence converge to? This is more difficult, and was the problem solved by Borgs, Chayes, and Lovász by inventing graphons. It appeared that graphons often correspond to solutions to extremal graph problems which are either unique, or can be made so by adding extra relations. But Dan and his coauthors have shown that this is very far from being the case; he showed us pictures of some enormously complicated graphons refuting the obvious conjectures.

After coffee we had the final talk of the conference, a plenary on “Products of random matrices” by Marcelo Viana. Furstenberg and Kesten showed that certain limits always exist; these turned out to be the extremal Lyapunov exponents. Marcelo and collaborators showed that these numbers depend continuously on the input. This can be widely generalised, to distributions on the group GL(d), and to things called linear cocycles. In the first case it is not enough that the distributions are close in the weak* topology, but also that their (compact) supports are close in the Hausdorff topology; he showed us examples to demonstrate this. Most of the results have generalisations in some form. It was very clear, but again my brain was full.


British Mathematical Colloquium, day 2

Things may be a bit briefer from now on.

The first plenary talk was by Irit Dinur on the unique games conjecture. I think I understood roughly what this conjecture says, but what it has to do with games, unique or otherwise, still escapes me completely.

Some NP-hard problems are constraint satisfaction problems; examples include 3-COL (is a given graph 3-colourable?) and 3-SAT (is a conjunctive normal form formula in which each clause has 3 literals satisfiable?). In each case, if the answer is no, we could ask what is the largest fraction of constraints that can be satisfied. For example, with 3-SAT, there is one constraint for each clause. If we assign Boolean values randomly to the variables, then seven of the eight assignments to each clause’s variables will make it true, and so the expected number of satisfied clauses is at least a fraction 7/8 of the total. In the same way, an edge gives a constraint on a colouring (its ends must have different colours), and so a random colouring will have at least 2/3 of the edges properly coloured.
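The 7/8 calculation can be checked mechanically by averaging over all assignments. A toy verification (my own illustration, with a made-up formula):

```python
from itertools import product
from fractions import Fraction

def average_satisfied_fraction(clauses, n_vars):
    """Average, over all 2^n_vars assignments, of the fraction of
    clauses satisfied. A clause is a tuple of literals, where +i / -i
    stands for variable i unnegated / negated."""
    total = Fraction(0)
    for assignment in product([False, True], repeat=n_vars):
        sat = sum(any(assignment[abs(l) - 1] == (l > 0) for l in cl)
                  for cl in clauses)
        total += Fraction(sat, len(clauses))
    return total / 2 ** n_vars

# A 3-clause on three distinct variables is falsified by exactly one
# of the 8 assignments to those variables, so the average is 7/8:
clauses = [(1, -2, 3), (-1, 2, 4), (2, -3, -4)]
print(average_satisfied_fraction(clauses, 4))  # 7/8
```

By linearity of expectation the same 7/8 holds for any number of such clauses, which is why some assignment always satisfies at least 7/8 of them.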

The PCP theorem says that there exists a positive δ such that it is NP-hard to distinguish between “all constraints satisfied” and “a fraction smaller than 1−δ of constraints satisfied”.

The unique games conjecture says something about a particular kind of CSP called “label cover with 1:1 constraints”, asserting that it is hard to distinguish between fraction ε and 1−ε of constraints satisfied. The breakthrough result is this with the upper value replaced by (1/2)−ε. This has a lot of consequences in related areas of complexity, but for fear of getting things wrong I will say no more.

After that, there was a wonderful talk by Clifford Cocks on the history of public-key cryptography. James Ellis at GCHQ had discovered the principles of public-key cryptography in 1969, and Cocks, arriving shortly afterwards, had constructed a practical scheme for doing it, almost identical with what we now call RSA (although technology wasn’t able to do the business until the 1980s); a later arrival, Malcolm Williamson, discovered another method, almost precisely the same as Diffie–Hellman key exchange.

On the other side of the Atlantic, Whitfield Diffie gave up his job in 1972 to learn about cryptography, and began working with Martin Hellman in 1974; by 1976 they had laid down the principles of PKC and had discovered Diffie–Hellman key exchange. Rivest, Shamir and Adleman read their paper and got to work; Rivest and Shamir tried many ideas, Adleman knocked them all down, until they came up with the RSA method. In one respect the Americans had advanced beyond the British; Diffie and Hellman realised the importance of authentication, and had discovered a method for providing it.

Having had a morning of computer-related things, I missed Mike Fellows’ talk on parameterized complexity, and listened instead to Marta Mazzocco on the geometry of the q-Askey scheme. This is a huge mountain, whose lower slopes on one side I have wandered among; Marta was approaching from the other side, and I am afraid that the landscape was almost completely unfamiliar to me. The q-Askey scheme is a poset with 29 boxes, each containing a type of orthogonal polynomials, with Askey–Wilson and q-Racah at the top. She talked mostly about the left side, and I realised that the polynomials I am more familiar with, such as Krawtchouk and Hahn, were all on the other side. Her talk was full of discrete versions of Painlevé equations (six important families of non-linear differential equations), and these were described by “chewing gum moves” on a certain Riemann surface; she has a duality which produces new information.

At lunchtime the rare books collection had put on a very nice exhibition, with works of (among many others) Pacioli, Fermat, Gauss, and Mary Somerville (who grew up in Burntisland).

After lunch, Nalini Joshi talked about building a bridge (motivated by the famous Harbour Bridge in her hometown Sydney) between apparently unrelated works by Japanese and European mathematicians. The talk began slowly with root systems and reflection groups, but picked up speed; I don’t feel competent to explain much of it, I’m afraid.

Then, not entirely by design, I found myself in the Algebra workshop. Tim Burness began with a talk on the length and depth of finite groups. The length of a finite group is the length of the longest chain of subgroups of the group. This is something I am interested in; in the early 1980s I found a beautiful formula for the length of the symmetric group S_n: take 3n/2, round up, subtract the number of 1s in the base 2 expansion of n, and subtract 1 from that.
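In case anyone wants to play with it, here is the recipe above as code (my own transcription, with a function name of my own invention):

```python
def length_of_symmetric_group(n):
    """Length of the longest subgroup chain in S_n, by the formula
    stated above: ceil(3n/2), minus the number of 1s in the binary
    expansion of n, minus 1."""
    return -(-3 * n // 2) - bin(n).count("1") - 1  # -(-x//2) is ceil(x/2)

# Chains realising the value for small n:
#   S_3 > A_3 > 1                length 2
#   S_4 > A_4 > V_4 > C_2 > 1    length 4
print([length_of_symmetric_group(n) for n in range(1, 9)])
```

The first few values (n = 1, …, 8) come out as 0, 1, 2, 4, 5, 6, 7, 10, matching the chains shown in the comments.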

The length has many nice properties; it is monotonic, and additive over composition factors. Tim has defined another parameter which he calls the depth, which is the minimum length of an unrefinable chain of subgroups (that is, each one maximal in the next). This fails to have these nice properties, and behaves very differently from length. For example, the depths of finite symmetric groups are bounded above by 24; the nice proof of this uses Helfgott’s proof of the Ternary Goldbach conjecture! But the depth of all finite simple groups is not bounded above. They have some results for algebraic groups as well (where the subgroups in the chain are restricted to be closed and connected).

Some of my colleagues are interested in something called the partition monoid. Maud de Visscher talked about the partition algebra, a slightly twisted version of the monoid algebra of the partition monoid. This arose in statistical mechanics; they have used it in representation theory, to get a better understanding of certain numbers called Kronecker coefficients arising from diagonal actions of symmetric groups on tensor products of Specht modules.

Finally, Marianne Johnson showed that the 2-variable identities of the bicyclic monoid (with two generators p and q satisfying the relation pq = 1) are identical to those of the 2×2 upper triangular matrices over the tropical semiring. A lovely and unexpected result!

The last event of the day was speed talks by PhD students, but your correspondent’s brain was full, so I cut this session.


British Mathematical Colloquium, day 1

The British Mathematical Colloquium began in St Andrews today. I will try to report some highlights, but I am recovering from food poisoning, so my account may be a bit sketchy in some places. Also, of course, there is no promise that I will continue!

As well as standard BMC fare of plenary lectures and “morning lectures”, there are five workshops, in algebra, analysis & probability, combinatorics, dynamics, and (a St Andrews speciality) history of mathematics.

The organisers had worried that attendance might be down: it is at an unusual time of year, convenient for us but less so for some universities south of the border; and St Andrews is fairly remote. But in the event, the main Physics Lecture Theatre was almost full for the opening lecture. We were welcomed by the leading organiser, our Regius Professor, who (after doing an impression of an airline cabin-staff member in pointing out the fire exits) gave way to a recent holder of one of Britain’s other two Regius chairs in mathematics, Martin Hairer.

Martin began by pointing out two guiding principles in probability theory: symmetry (“equivalent” outcomes, such as heads and tails in a coin toss, should have the same probability), and universality (harder to describe, but roughly, if a random outcome depends on many different sources of randomness, its detailed description should not matter too much). His first example, apart from the Central Limit Theorem for something like an infinite sequence of coin tosses, was Brownian motion. The apparently random motion of pollen grains in water was explained in around 1905 by Einstein and Smoluchowski (independently) as caused by many small nudges from random impacts of water molecules; they showed that the distribution should satisfy the heat equation, and this was verified experimentally by Perrin ten years later, the first experimental “proof” of the atomic hypothesis. But five years earlier, Bachelier had investigated the movement of share prices on the stock exchange, and had come to exactly the same conclusion (a precursor of the Black–Scholes equation). The rigorous mathematical description (a measure on the space of continuous functions) was given by Wiener, and the universality proved by Donsker.

But there are other universality classes, such as the Ising model close to the critical temperature, whose limiting behaviour is far from “Gaussian”. Its behaviour is conjectured to be universal for many phase transition models, but this has not been proved. Yet another class consists of surface growth models, which describe many physical situations but also the shape formed by random Tetris pieces falling from the sky.

Martin’s title was “Bridging Scales”, and his interest was in processes where the large-scale and small-scale scaling limits are known distributions (such as Gaussian and KPZ); what happens on intermediate scales? He and coauthors have a quite general theorem involving solutions of a certain type of stochastic differential equation, and there is not one behaviour, but a family of “canonical” behaviours described by a nilpotent Lie group. But I was floundering at this point.

I spent the next hour and a half in the History workshop, where I will describe only the first talk. Ursula Martin had a research grant to investigate how mathematical impact occurs. She started off by tracing the opposition from earliest times between two views: one expounded by G. H. Hardy (in A Mathematician’s Apology) and Vannevar Bush (whose report, containing very little data, was perhaps the inspiration for setting up the NSF, and included the words “scientific progress … results from the free play of free intellects …”); the other can be traced back to the founding of the Royal Society in the 17th century (“to extend the boundaries of Empire, and of arts and sciences”), and can be heard in almost every pronouncement of funding bodies today: we should deliver highly skilled people to the labour market, create spin-off companies, and so on.

The conclusions that Ursula and her colleague Laura Meagher came to were interesting. In a previous paper by Meagher and Nutley, impact was classified into five types (and reading this you see how impoverished the REF definition of impact is): conceptual, instrumental, capacity building, attitude or cultural change, and enhancing connectivity. Apart from the fact that essentially only the second and third count for REF, Meagher and Martin found that the REF protocols reinforce the myth that impact only happens in a linear order, whereas in fact it is a tangled web whose components cannot be separated.

The take-home messages were that impact is about people, not processes, and requires “knowledge intermediaries” rather than technology transfer offices; and, most important, we need more good stories.

The final talk of the day was the public lecture by Julia Wolf, on finding structure in randomness. It had attracted a fair number of people who were not BMC delegates (she asked for a show of hands, and was relieved at the result).

Her message was: if our object is random, or even just “looks like random” (technically quasi-random), then we can learn a lot about it, in particular counts of subconfigurations; if it is not random, then it is structured, and this gives us a lot of information; but even these two principles are not universally applicable. Her examples were taken from graphs and sets of integers. (Despite my last-but-one post, there is of course no problem in choosing a random set of integers!) For graphs, there is the theorem of Fan Chung, Ron Graham and Richard Wilson: if a graph has edge density p and has the number of 4-cycles which a random graph with edge probability p would have, then it shares many properties with random graphs. She said a few words which made this clearer to me than I had ever understood it before: the number of 4-cycles in a graph with given edge density is minimised by the random graph, and therefore interesting stuff clusters at that point.
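
The Chung–Graham–Wilson condition is easy to probe numerically. Here is a quick illustrative sketch (mine, not from the talk): sample a random graph with edge probability p, count its 4-cycles by brute force, and compare with the expected count 3·C(n,4)·p⁴; for a quasi-random graph the two should agree closely.

```python
import itertools
import math
import random

random.seed(42)
n, p = 40, 0.5

# Sample G(n, p): include each possible edge independently with probability p.
adj = [[False] * n for _ in range(n)]
for i, j in itertools.combinations(range(n), 2):
    if random.random() < p:
        adj[i][j] = adj[j][i] = True

# Count 4-cycles by brute force: each 4-set of vertices supports
# up to 3 distinct cyclic orderings.
c4 = 0
for a, b, c, d in itertools.combinations(range(n), 4):
    for w, x, y, z in ((a, b, c, d), (a, b, d, c), (a, c, b, d)):
        if adj[w][x] and adj[x][y] and adj[y][z] and adj[z][w]:
            c4 += 1

expected = 3 * math.comb(n, 4) * p**4  # expected number of 4-cycles in G(n, p)
print(c4, round(expected))  # the two counts agree to within a few per cent
```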

A couple of further snippets: She described the Green–Tao theorem on primes in arithmetic progression, and mentioned that the longest known such progression has length 26, but finding one of length 27 is completely out of computational reach; the Szemerédi regularity lemma says that, given ε, any large graph can be dissected into a number of pieces bounded by a function of ε such that the edges between almost all pairs of pieces form quasi-random bipartite graphs, but the function of ε is a tower of 2s of height ε⁻²; and there is a possible game-changer on the horizon, the polynomial method recently used so spectacularly by Croot, Lev and Pach and by Ellenberg and Gijswijt (which I described here).

Then there was a welcoming wine reception, but I had had enough for the day and slipped away.

Posted in events

Microsoft buys GitHub?

Some years ago, it was announced that Microsoft had done a deal with Donald Knuth and the American Mathematical Society and had bought the rights to TeX. It was slightly worrying until you noticed that the announcement was dated 1 April.

Now I see that the company which has fought a long and bitter war against open source software is to buy GitHub, according to the BBC news. And April is long past.

When Microsoft bought Skype, I found it would no longer run on my Linux laptop. So should we be worried that they have bought GitHub? I don’t use it myself, but a lot of the excellent free software that I do use, such as GAP, is now developed there. Are they buying it to destroy the competition, as the railways did to the canal network in Britain in the nineteenth century?

I have great admiration for the boss of Microsoft, but nothing but fear and loathing for his company.

Posted in software, technology

The infinite lottery

Probability is a mathematical subject where the application of the results to the real world remains controversial, and different schools of thought flourish.

Littlewood, in his Miscellany, discusses this, and comes firmly to the conclusion that probability theory can say nothing about the real world, and that a pure mathematician should “wash his hands of applications”, or if talking to someone who wants them, should say something along the lines of “try this; it is not my business to justify [it]”. However, later in the book, in the section on large numbers, he is happy to estimate the odds against a celluloid mouse surviving in Hell for a week. (Probably he would have argued that this is not the “real world”.)

That said, almost all mathematicians regard Kolmogorov’s axioms as the basis for probability theory. Littlewood says that “the aspiring reader finds he is expected to know the theory of the Lebesgue integral. He is sometimes shocked at this, but it is entirely natural.”

I had a project student this year who wrote about developing probability theory for non-classical logics such as intuitionistic or paraconsistent logic. I certainly learned a great deal about non-classical logic from supervising the project, for which I am very grateful to her; I might say more about this later. What I have to say below has nothing to do with her project, but the idea was sparked by a section in one of the papers she unearthed in her literature search. This is, perhaps, my small contribution to the foundations of probability theory.

Kolmogorov’s axioms say, in brief, that probability theory is measure theory on a space with total measure 1. In more detail, the ingredients are (Ω,F,p), where

  • Ω is a set, which probabilists might call the sample space, but I rather like the philosophers’ term for this, the set of possible worlds;
  • F is a σ-algebra of subsets of Ω; that is, F is non-empty, contains Ω, and is closed under complementation and countable unions;
  • p is a function from F to the real numbers satisfying
    • p(S) is non-negative for all S in F;
    • p(Ω) = 1;
    • p is additive over countable disjoint unions; that is, if the sets Si are pairwise disjoint for i ∈ N, then p(∪Si) = Σp(Si).
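
For a finite sample space the axioms are easy to verify directly. The sketch below (my own illustration, with F taken to be the full power set of a six-element Ω, and the uniform measure) checks non-negativity, total measure 1, and additivity, using exact rational arithmetic:

```python
from fractions import Fraction
from itertools import combinations

# A fair six-ticket lottery: Omega = {0, ..., 5}, F = the full power set.
omega = frozenset(range(6))

def p(S):
    """Uniform probability measure: p(S) = |S| / |Omega|."""
    return Fraction(len(S), len(omega))

events = [frozenset(c) for r in range(len(omega) + 1)
          for c in combinations(omega, r)]

assert all(p(S) >= 0 for S in events)   # non-negativity
assert p(omega) == 1                    # p(Omega) = 1
# additivity over disjoint unions (the space is finite, so countable
# additivity reduces to the finite case)
A, B = frozenset({0, 1}), frozenset({2, 3, 4})
assert A.isdisjoint(B) and p(A | B) == p(A) + p(B)
```

For six tickets everything is consistent because p({Ti}) = 1/6 and six copies of 1/6 sum to 1; the trouble discussed below arises only when the set of tickets is countably infinite.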

The paper in question is “Kolmogorov’s axiomatization and its discontents”, by Aidan Lyon, in the Oxford Handbook of Probability and Philosophy. Section 4 of the paper takes Kolmogorov’s axioms to task because they cannot model a lottery with a countably infinite number of tickets, which is fair in the sense that each ticket has the same chance of winning.

Right from the start, this struck me as odd. One of the themes of mathematical philosophy is that mathematicians are not careful enough in their work, and accept arguments without sufficient scrutiny. The proportion of people who support some form of constructivism is probably much higher among philosophers of mathematics than among mathematicians. (Typically, they reject proof by contradiction: a proof of a theorem must be a construction which demonstrates the theorem, it is not enough to show that assuming that the theorem is false leads to a contradiction.)

But here, it seems to be the other way around. I cannot imagine any construction which would actually implement a fair lottery with a countably infinite number of tickets. To point up the contrast, Kolmogorov’s axioms have no difficulty in handling the result of tossing a fair coin countably many times. This generates the set of all countable 0-1 sequences, interpreting 1 as “heads” and 0 as “tails”. Now the measure on this set is the standard product measure induced from the measure on {0,1} in which each of the sets {0} and {1} has probability 1/2.

Here is the argument that a fair countable lottery is not possible. The sample space is the countable set of tickets. (Said otherwise, there is a possible world in which any ticket is the winning ticket.) We want all these outcomes to have the same probability p. Now the sample space Ω is the union of the singleton sets {Ti} for i ∈ N, which are pairwise disjoint; so, by countable additivity, if we add up p a countable number of times, the result should be 1. But this is not possible, since if p = 0, then the sum is 0, whereas if p > 0, then the sum is infinite.

(The first part of this argument generalises to the fact that a countable union of null sets (sets with probability 0) is a null set.)

The difference between the two situations, it seems to me, is that I can imagine a mechanism that allows a coin to be tossed infinitely often. If we are going to have infinite processes in mathematics, this seems a fairly harmless one. But I am unable to visualise a fair method of picking one lottery ticket from a countable set. One could try the following: For each ticket, toss a fair coin infinitely often to generate the base 2 expansion of a real number in the unit interval; then the winner is the ticket whose number is the largest. Trouble is, there may not be a largest; the set of chosen numbers has a supremum but possibly no maximum.
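
For a finite number of tickets the mechanism above is unproblematic: each ticket draws a uniform real number, the largest draw wins, and ties occur with probability 0. A quick simulation (illustrative only) confirms that each ticket is then equally likely to win; the difficulty lies entirely in the passage to a countable set of tickets, where a largest draw need not exist.

```python
import random

random.seed(0)

def lottery(n_tickets):
    # Each ticket draws a uniform real in [0, 1); the largest draw wins.
    # Ties have probability 0 (and are vanishingly rare with floats).
    draws = [random.random() for _ in range(n_tickets)]
    return max(range(n_tickets), key=draws.__getitem__)

wins = [0] * 5
for _ in range(50_000):
    wins[lottery(5)] += 1
print(wins)  # each of the 5 tickets wins roughly 10,000 times
```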

Another way of describing the same process is this. Take a fair coin, and toss it once for each ticket. Any ticket which gets tails is eliminated; those which get heads remain. Repeat this process until a single ticket remains, continuing into the transfinite if necessary (but there is no guarantee that this will ever happen!).

Littlewood gives an argument which shows that, if we really want a countable fair lottery, we have to give up either finite additivity or the property that the probability of the whole space Ω is 1. Take the following countable pack of cards: each card has consecutive natural numbers written on its two sides, and the number of cards bearing the numbers n and n+1 is 10ⁿ. There are two players, Alice and Bob. A card is chosen randomly from the pack and held up between them, so that each can see one side, and they are invited to bet at evens that the number on their own side of the card is smaller than the number on the other side. If Alice sees the positive integer n, it is ten times as likely that the number on the other side is n+1 as that it is n−1. So she should bet. Bob reasons in the same way. So each has the expectation of winning.
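
Alice’s reasoning can be checked on a truncated version of the pack (a sketch of the conditional calculation, not Littlewood’s own presentation): among cards showing n on one side, 10ⁿ⁻¹ are of type (n−1, n) and 10ⁿ of type (n, n+1), each equally likely to be held either way round, so the hidden side is larger with probability 10/11 for every n ≥ 1.

```python
from fractions import Fraction

def p_other_side_larger(n):
    """P(the hidden side shows n+1 | your side shows n), for n >= 1.

    The pack contains 10**k cards bearing the pair (k, k+1), and a card
    is held up either way round with equal probability; conditioning on
    seeing n, the odds are 10**n : 10**(n-1) in favour of the card
    being of type (n, n+1).
    """
    from_below = 10 ** (n - 1)  # cards (n-1, n) with n facing you
    from_above = 10 ** n        # cards (n, n+1) with n facing you
    return Fraction(from_above, from_below + from_above)

print(p_other_side_larger(3))  # 10/11, independently of n
```

Of course, on the full infinite pack there is no uniform way to “choose a card randomly”, which is exactly where the paradox bites.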

Littlewood describes this as a “paradox”; I think it is rather more than that.

Lyon claims that it is not necessary to be able to imagine a mechanism in order for the objection to Kolmogorov’s axioms to be valid. I don’t think I agree.

Anyway, Lyon describes a possible solution of using non-standard real numbers as probabilities, so that the probability of picking a ticket in the lottery is a suitable infinitesimal with the property that adding it countably many times gives 1. But he rightly remarks that there are problems with this solution: lack of uniqueness of non-standard models and inability to name the elements.

When I read this, it struck me that there was a potentially nicer answer to the question: use John Conway’s surreal numbers as probabilities. These numbers are explicitly defined and named; moreover, there is a unique surreal number 1/ω with the property that adding it to itself ω times will give 1.

Conway’s number system comes out of his general framework for game theory, described in his book On Numbers and Games. The first account of it was the mathematical novel Surreal Numbers by Don Knuth; he describes two backpackers who unearth an ancient tablet bearing the signature “J.H.W.H. Conway” and giving the rules for the construction of Conway’s number system, and their elation and frustration as they decipher the tablet and try to prove the assertions on it.

Has anybody seen a treatment of probability using Conway’s surreal numbers? It seems to me that in order to have such a treatment, one would presumably need to modify the notion of “null set”, to address the fact that the union of countably many “null sets”, such as the lottery tickets with probability 1/ω, may not be null; there may be difficulties lurking here. The problem of inventing a mechanism for the lottery still remains, but maybe closer inspection of the way the surreal numbers are generated would suggest a method.

Actually, the sting in the tail of this story is that there are some absolutely classical mathematical results which assign “probabilities” to sets of natural numbers. For two famous examples, which incidentally lead to the same value,

  • the probability that a natural number is squarefree is 6/π²;
  • the probability that two natural numbers are coprime is 6/π².

This uses a completely different notion of probability. The statements in question really assert that the proportion of natural numbers up to n which are squarefree (or the proportion of pairs which are coprime) tends to the limit 6/π² as n→∞.
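
This density notion is easy to check empirically. The sketch below (illustrative; the cut-offs 100,000 and 1,000 are arbitrary) estimates both proportions and compares them with 6/π² ≈ 0.6079:

```python
import math

# Proportion of squarefree numbers up to N, via a sieve on square divisors.
N = 100_000
squarefree = [True] * (N + 1)
d = 2
while d * d <= N:
    for m in range(d * d, N + 1, d * d):
        squarefree[m] = False
    d += 1
sf = sum(squarefree[1:]) / N

# Proportion of coprime pairs (a, b) with 1 <= a, b <= M.
M = 1_000
cp = sum(math.gcd(a, b) == 1
         for a in range(1, M + 1) for b in range(1, M + 1)) / M**2

# Both estimates land close to the limiting density 6/pi^2.
print(round(sf, 4), round(cp, 4), round(6 / math.pi**2, 4))
```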

Posted in exposition