Ambiguity

JoAnne Growney has gently taken me to task for saying that there is no ambiguity in mathematics. So here I will play devil’s advocate, and examine two famous equations for traces of ambiguity. For the first, I might have taken the chorus from The Animals’ song “Roberta”: “One and one is two, two and one is three”, but instead I will consider Russell’s equation 2+2 = 4, so-called because it is proved, after hundreds of pages of preliminaries, in Russell and Whitehead’s Principia Mathematica. Russell himself said,

“3” means “2+1”, and “4” means “3+1”. Hence it follows (though the proof is long) that “4” means the same as “2+2”. Thus mathematical knowledge ceases to be mysterious.

The second equation I will consider is Euler’s equation e^(iπ) = –1, which you can see in poetic form on JoAnne’s blog.

Before I begin, I should refer non-mathematicians reading this (are there any?) to Barry Mazur’s book Imagining Numbers (particularly the square root of minus fifteen), published by Farrar, Straus and Giroux (and by Penguin in Britain). (He explicitly warns mathematicians that the book is not for us.) In it he compares the task of the book’s title with that of imagining “the yellow of the tulip” (the phrase is from John Ashbery’s poem “Whatever it is, whatever you are”, in A Wave: Poems by John Ashbery, Farrar, Straus and Giroux, 1985). When you read the latter phrase, you probably saw without any conscious effort a mental picture of a yellow tulip. However, for the former phrase, it took mathematicians a couple of centuries of effort to create a mental picture. This will be relevant for Euler’s equation.

Russell’s equation

A non-mathematician probably sees no difficulty with the equation 2+2 = 4, and wonders how there can be any ambiguity here. But it is not so simple for mathematicians, since we need to know what is meant by the numbers 2 and 4, the operation +, and the relation =.

So I will begin with a lightning tour of the construction of the number systems. I will take a logical rather than historical view, since I am interested in what the equation means to a mathematician.

We begin with the natural numbers (including 0). The most important thing about these is their succession: they begin 0, 1, 2, 3, 4,… More formally, we want the number 3 (for example) to be a standard 3-element set against which we can compare other sets (rather like the standard metre); and from a logical point of view we want to build it out of what we have already built, which at this point consists of the numbers 0,1,2. So we define 3={0,1,2}, and similarly for other numbers. In particular, we have nothing available when we are constructing zero, so define it to be the empty set.
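As a rough illustration of this construction, here is a small Python sketch. The function name von_neumann is my own label, and Python's frozenset merely stands in for a set in the set-theoretic sense.

```python
def von_neumann(n):
    """Return the natural number n as a set: 0 is empty, and n = {0, 1, ..., n-1}."""
    number = frozenset()            # zero is the empty set
    for _ in range(n):
        number = number | {number}  # the next number collects everything built so far
    return number

zero, one, two, three = (von_neumann(k) for k in range(4))

print(three == frozenset({zero, one, two}))  # True: 3 = {0, 1, 2}
print(len(three))                            # 3: the standard 3-element set
```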

Now we define addition recursively: m+0 = m, and m+s(n) = s(m+n), where s denotes the successor function. With this definition it is possible to prove that Russell’s equation holds.
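The recursive definition can be run directly. In the sketch below, the nested-tuple encoding of numbers (ZERO as the empty tuple, s wrapping one more layer) is an assumption made for illustration; only the two defining equations come from the text.

```python
ZERO = ()

def s(n):
    """Successor: wrap n in one more layer."""
    return (n,)

def add(m, n):
    """Addition defined by m + 0 = m and m + s(k) = s(m + k)."""
    if n == ZERO:
        return m
    (k,) = n              # n is s(k) for exactly one k
    return s(add(m, k))

TWO = s(s(ZERO))
FOUR = s(s(s(s(ZERO))))

# 2 + 2 = 2 + s(1) = s(2 + 1) = s(s(2 + 0)) = s(s(2)) = 4
print(add(TWO, TWO) == FOUR)  # True
```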

Already we have a couple of problems. First, the Russell–Whitehead definition of the numbers 2 and 4 is quite different from the one presented here. At the least, a different definition would need a different proof; also, what guarantee is there that we are talking about anything like the same thing?

Second, and related, a lay person thinks of Russell’s equation as saying “two apples plus two oranges equals four fruit”. In other words, and more generally, if two disjoint sets A and B are bijective with the natural numbers m and n respectively (as we’ve defined them), then the union of A and B is bijective with m+n. This is a statement about sets rather than about the recursive definition of addition, so it too needs a proof.

But we need more than the natural numbers to do mathematics, so we have to extend our number system. The extensions are motivated by equations we can’t solve in the existing system. Thus, there is no natural number n satisfying 5+n=3; we adjoin a solution of this equation, represented by the ordered pair (3,5) (suggesting 3–5). Of course, this number is also represented by (100,102) and many other pairs; so it is really an equivalence class of ordered pairs. In set theory, an ordered pair is a set of sets; so any integer is a set of sets of sets.
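To see how two pairs can be recognised as the same integer without ever subtracting, here is a minimal sketch. The name same_integer is hypothetical, and Python tuples of ints stand in for ordered pairs of natural numbers.

```python
def same_integer(p, q):
    """(a, b) and (c, d) represent the same integer exactly when a + d == b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c   # uses only addition of natural numbers

print(same_integer((3, 5), (100, 102)))  # True: both suggest 3 - 5
print(same_integer((3, 5), (5, 3)))      # False
```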

It is possible to define the order relation, and the arithmetic operations, on the integers as thus defined; I skip the details here.

But now we have re-constructed the natural numbers; the number 2, for example, is an equivalence class containing the pairs (2,0), (3,1), (4,2), and so on. This gives us the problem of proving again Russell’s equation for this new interpretation of 2, 4 and +. We actually show that the map from natural numbers to integers taking the natural number n to the equivalence class containing (n,0) is an isomorphism (preserves addition and multiplication).
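The following sketch checks, on a finite range only, that the map n to (n,0) respects addition and multiplication. The helper names and the choice of a normalised representative for each class are my assumptions, and a finite check is of course not a proof.

```python
def normalise(pair):
    """Pick the representative of the class with at least one zero entry."""
    a, b = pair
    k = min(a, b)
    return (a - k, b - k)

def add(p, q):
    (a, b), (c, d) = p, q
    return normalise((a + c, b + d))

def mul(p, q):
    (a, b), (c, d) = p, q
    # (a - b)(c - d) = (ac + bd) - (ad + bc)
    return normalise((a * c + b * d, a * d + b * c))

def embed(n):
    """Send the natural number n to the class containing (n, 0)."""
    return (n, 0)

print(add(embed(2), embed(2)) == embed(4))  # True: Russell's equation again

preserved = all(
    add(embed(m), embed(n)) == embed(m + n) and mul(embed(m), embed(n)) == embed(m * n)
    for m in range(20) for n in range(20)
)
print(preserved)  # True on this finite range
```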

Now we pass from the integers to the rational numbers by adding solutions to equations like 4n=2; from the rational numbers to the real numbers by either of two somewhat complicated methods, Dedekind cuts and Cauchy sequences; and from the real numbers to the complex numbers by adjoining a solution of i^2 = –1. (The last step is actually the easiest.) At each step, we need to show that an isomorphic copy of the previous system is contained within the new one; in other words, re-verify Russell’s equation and similar results. We have an additional burden of ambiguity when we construct the real numbers, since a real number such as π might be a Dedekind cut or an equivalence class of Cauchy sequences; these are very different mathematical objects.

We conclude that, while the meaning of the symbols in Russell’s equation is hugely ambiguous (we might write some variants of it as (+2)+(+2) = (+4), (2/1)+(2/1) = (4/1), 2.0+2.0 = 4.0, and (2+0i)+(2+0i) = (4+0i)), operationally it is completely unambiguous; two plus two is always four!
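Programming languages meet the same situation. Here is a sketch using Python's built-in numeric types as stand-ins for the number systems (floats are only an approximation to the reals, so this is an analogy rather than a model):

```python
from fractions import Fraction

variants = [
    (+2) + (+2) == (+4),                                 # integers
    Fraction(2, 1) + Fraction(2, 1) == Fraction(4, 1),   # rationals
    2.0 + 2.0 == 4.0,                                    # floating-point "reals"
    complex(2, 0) + complex(2, 0) == complex(4, 0),      # complex numbers
]
print(all(variants))  # True

# The four objects called "2" are different in type, yet compare equal.
print(2 == Fraction(2, 1) == 2.0 == complex(2, 0))  # True
```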

To make the point another way, suppose that we are working in a completely different system, the integers mod 3. Then 2+2 = 1. But there is no conflict, since 4 = 1 in this system.

Euler’s equation

For Euler’s equation, the basic problems are similar: what are e, π, and i, and what does exponentiation by a complex number mean? The first part is not difficult, but I will spend a little time on e, since this is relevant to the second question.

There is no problem explaining what c^2, or indeed any positive integral power of c, means, for any number c. For the time being I will assume that c itself is a positive real number, since a different kind of ambiguity occurs otherwise.

Zero and negative powers are easy: we put c^0 = 1 and c^(–n) = 1/c^n.

We can define c^(p/q), for any rational number p/q, to be the unique positive real number d such that d^q = c^p.
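This definition can be turned into a computation using only integer powers. The sketch below finds d by bisection; the function name, the fixed number of bisection steps, and the assumption that q is a positive integer are mine.

```python
def rational_power(c, p, q, steps=200):
    """Approximate the positive real d with d**q == c**p, for c > 0 and integer q > 0."""
    target = c ** p                     # an integer power of c (1 / c**|p| if p < 0)
    low, high = 0.0, max(1.0, target)   # d lies in this interval
    for _ in range(steps):
        mid = (low + high) / 2
        if mid ** q < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

print(rational_power(8, 2, 3))   # close to 4, since 4**3 == 8**2
print(rational_power(2, 1, 2))   # close to the positive square root of 2
```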

Some new principle is required for irrational powers c^x. We would like to say: approximate x by rational numbers r, and then let c^x be approximated by the rational powers c^r. This procedure is potentially highly ambiguous; we need to show that the result doesn’t depend on the process (essentially the statement that c^x is a continuous function of x).
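A numerical experiment suggests (but does not prove) that the process doesn't matter: rational approximations from below and from above squeeze the power into a shrinking interval. In this sketch the function name is hypothetical, and Python's floating-point power stands in for the rational power defined earlier.

```python
import math
from fractions import Fraction

def power_bounds(c, x, digits):
    """Bracket c**x (for c > 1) using decimal approximations to x with the given number of digits."""
    scale = 10 ** digits
    below = Fraction(math.floor(x * scale), scale)      # rational r with r <= x
    above = Fraction(math.floor(x * scale) + 1, scale)  # rational r with r > x
    return c ** float(below), c ** float(above)

for digits in (1, 3, 6):
    lower, upper = power_bounds(2.0, math.sqrt(2), digits)
    print(digits, lower, upper, upper - lower)   # the gap shrinks as digits grows
```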

After each of these steps we need to verify various properties (the laws of exponents), for example c^x·c^y = c^(x+y).

Exponentiation with a complex exponent is more difficult. We could invoke a principle of complex analysis: once c^x is defined for all real x, there is at most one analytic extension to the complex numbers. I will use a little less analysis here. First we return to the number e. I will tell the story in an ahistorical way; for a more accurate form, see Eli Maor’s book e: The story of a number.

The makers of tables of antilogarithms (values of 10^c) noticed that 10^0.001 is approximately 1.0023, and more generally 10^x is approximately 1+2.3x for small x. Then values for larger x can be found by taking powers. The constant 2.3 is inconvenient, and it would be nicer if instead it were 1. To obtain this, the base 10 should be replaced by a certain number e, whose value is about 2.71828182845. The upshot is that the function e^x is equal to its derivative, for real x.
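These numbers are easy to check. In the sketch below, slope_at_zero is a hypothetical helper that estimates the constant in "base^x is approximately 1 + (constant)·x" by a finite difference.

```python
import math

def slope_at_zero(base, h=1e-6):
    """Estimate k in base**x ≈ 1 + k*x for small x."""
    return (base ** h - 1) / h

print(10 ** 0.001)            # about 1.0023
print(slope_at_zero(10))      # about 2.3 (the natural logarithm of 10)
print(slope_at_zero(math.e))  # about 1: the convenient base
```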

Suppose we want to extend this property, and the laws of exponents, to the complex numbers. First we see that we need to put e^(x+iy) = e^x·e^(iy), so we only need to make a definition for purely imaginary exponents. Next we note that differentiating e^(iy) multiplies it by i. Now the function cos(y)+i sin(y) happens to have the same property, and also takes the value 1 when y=0. A simple version of the uniqueness theorem for solutions of differential equations gives us the famous formula e^(iy) = cos(y)+i sin(y).

In particular, putting y=π gives Euler’s equation.
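The differential-equation argument can be imitated numerically: start at 1 and repeatedly apply "the derivative is i times the value" in small steps. This is only a sketch under my own choices (the function name, the step count, and the crude Euler stepping method), so the agreement is approximate.

```python
import math

def follow_derivative_rule(y, steps=100_000):
    """Approximate f(y) where f(0) = 1 and f' = i*f, using small linear steps."""
    f = complex(1.0, 0.0)
    h = y / steps
    for _ in range(steps):
        f += 1j * f * h   # f(t + h) ≈ f(t) + h*f'(t), with f'(t) = i*f(t)
    return f

at_one = follow_derivative_rule(1.0)
print(at_one, complex(math.cos(1.0), math.sin(1.0)))  # the two nearly agree

at_pi = follow_derivative_rule(math.pi)
print(at_pi)  # close to -1, as Euler's equation predicts
```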

However, a final remark about the base is required. If x is positive, then it has two square roots; we somewhat arbitrarily chose x^(1/2) to be the positive one. This does involve a choice, but there is a natural thing to do. If the base is not a positive real number, there will similarly be choices, sometimes infinitely many! For example, the square root of –1 is either i or –i, and there is no way of deciding which one should be (–1)^(1/2). A more elaborate example is a problem of some historical interest, the computation of i^i. Since i = e^(iπ/2), we would expect that i^i = e^(–π/2). However, the sine and cosine functions repeat their values with period 2π, so it is equally true that i = e^(iπ(2n+1/2)) for any integer n, so that i^i can take any of the infinitely many distinct real values e^(–(2n+1/2)π).
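A short computation makes the many values of i^i concrete. The function name below is my own; the last line relies on the observation that Python's built-in complex power returns the n=0 value.

```python
import cmath
import math

def i_to_the_i(n):
    """The value of i^i obtained from writing i = e^(iπ(2n + 1/2))."""
    return math.exp(-(2 * n + 0.5) * math.pi)

for n in (-1, 0, 1):
    exponent = 1j * math.pi * (2 * n + 0.5)
    # cmath.exp(exponent) is i (up to rounding) for every n, yet i^i differs
    print(n, cmath.exp(exponent), i_to_the_i(n))

print(1j ** 1j)  # agrees with the n = 0 value
```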

The resolution of this ambiguity can be done in either of two ways. Either we make a truly arbitrary choice to restrict the arguments of sine and cosine to a particular interval such as (–π,π]; or else we take the more complicated approach of saying that our function is defined not on the complex numbers but on a more complicated gadget called a Riemann surface. But that really is another story!
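The first resolution is the one software usually adopts. As a sketch, Python's cmath.phase returns an argument between –π and π, and the arbitrariness shows up as a jump across the negative real axis. The helper principal_power is hypothetical, written here just to show the choice being used.

```python
import cmath
import math

def principal_power(base, exponent):
    """base**exponent for nonzero base, using the argument that cmath.phase chooses."""
    log_base = complex(math.log(abs(base)), cmath.phase(base))
    return cmath.exp(exponent * log_base)

print(cmath.phase(-1))                    # pi
print(cmath.phase(complex(-1, 1e-12)))    # just below pi
print(cmath.phase(complex(-1, -1e-12)))   # just above -pi: the jump
print(principal_power(-1, 0.5))           # close to i rather than -i
```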

About Peter Cameron

I count all the things that need to be counted.

6 Responses to Ambiguity

  1. Apologies for the proofreading errors in the first version I posted. I am doing the editing on a netbook in a small Italian hill town, without benefit of a good editor. I hope it is now fixed!

  2. Sean Carmody says:

    Peter, one thing I thought you might touch on in the 2+2=4 discussion is the idea of the abstract properties of addition (that’s even more abstract than Russell’s definition!). What I mean by that is while you seem to have to keep proving 2+2=4 in every different context (set definitions, equivalence class pairs, rationals, etc), all you really need to know is that m+0 = m, and m+s(n) = s(m+n) still holds (perhaps as a property rather than a definition) and your original proof carries over to the new context.

  3. Perhaps, playing devil’s advocate, I was making things harder than they need to be. Logically we are producing a new proof of Russell’s equation in each new number system; in fact, we don’t even have to re-verify the definition. All that is necessary is to prove that, if a+b=c holds in the old system, then it holds in the new; this is actually quite straightforward.

  4. The poetic version of Euler’s equation, entitled “The transendence of Euler’s formula”, is here.

  5. JoAnne says:

    My desktop dictionary (1986 Webster’s New World) describes “ambiguity” as “the quality or state of being ambiguous” and then for “ambiguous” gives a first and second choice:
    (1) having two or more possible meanings
    (2) not clear; indefinite; uncertain; vague.
    I propose that mathematics, like poetry, is rich with the first type of ambiguity and endeavors to eschew the second.
    JoAnne Growney http://poetrywithmathematics.blogspot.com

  6. Pingback: Ambiguity, 2 « Peter Cameron's Blog
