Tuesday, June 13, 2017

What is this thing called i?

Where did complex numbers come from?

You might think that people invented \(i\) just so that they had a solution to the quadratic equation \(x^2+1=0\). But no: it made perfect sense for a long time just to say that this quadratic (and any other quadratic with a negative discriminant) just doesn't have a solution. There was a much more compelling reason for pretending that there was a number whose square is \(-1\) than just making up solutions to quadratics that don't have them.

A few centuries ago, an algorithm was found to solve cubic equations. But something weird happened: on the way to finding a root of a cubic, sometimes you had to pretend that \(-1\) had a square root, and work with that for a while. It was all right, though, because this non-existent number always cancelled out, leaving behind a root that made sense. The fact that you can find genuine solutions to problems by using these funny objects as book-keeping devices suggested that maybe we should take the book-keeping devices seriously as a new kind of number. And once they were taken seriously, they turned out to be a door into a mathematical paradise of wonderful (and useful) structure.
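To watch the cancellation happen, here is a minimal Python sketch of the cubic formula. The example equation \(x^3-15x-4=0\), the function name `cardano_root` and the restriction to cubics of the form \(x^3+px+q=0\) are assumptions made for illustration; they are not taken from the account above.

```python
import cmath

def cardano_root(p, q):
    """Return one root of x^3 + p*x + q = 0 via Cardano's formula.

    Illustrative sketch: assumes the cube root u below is non-zero.
    """
    # The quantity under the square root. For this example it is negative,
    # so its square root is imaginary.
    disc = q * q / 4 + p ** 3 / 27
    s = cmath.sqrt(disc)
    # Principal complex cube root of -q/2 + s.
    u = (-q / 2 + s) ** (1 / 3)
    # The partner cube root is fixed by the requirement u * v = -p / 3.
    v = -p / (3 * u)
    return u + v

root = cardano_root(-15, -4)   # x^3 - 15x - 4 = 0
print(root)                    # the imaginary parts cancel, up to rounding
```

Here the square root taken on the way is \(\sqrt{-121}=11i\), yet the root that comes out at the end is the perfectly ordinary number \(4\).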

It is a sad accident of history that this new kind of number, the square root of a negative quantity, has become known as an imaginary number. That gives the impression that they are somehow less real, somehow inferior to the ones that we call real numbers. In fact, they're no more imaginary than negative numbers, non-integer rational numbers, irrational algebraic numbers, or transcendental numbers. All of them are convenient extensions of the natural numbers with interesting and useful properties, and all of them are just as imaginary as the 'imaginary' numbers.

Of course, the imaginary numbers by themselves aren't very interesting or useful. They naturally cropped up in combination with the real numbers in objects of the form \(z=a+ib\) where \(a\) and \(b\) are real numbers and \(i^2=-1\), which we call complex numbers. \(a\) is called the real part of \(z\) and \(b\) is called the imaginary part. (Yes, the imaginary part is a real number. Deal with it.) If you forget for the moment that there is no such thing as \(i\), you can do algebra with them just by replacing \(i^2\) with \(-1\) whenever it occurs, and that is what was done to solve cubic equations.

This suggests that it's worth taking these complex numbers seriously as objects in their own right.

But what are they?

The simplest answer is basically what I just wrote above. Think of complex numbers as polynomials in \(i\) when you do algebra, and replace \(i^2\) by \(-1\) whenever it occurs. Then you can add and multiply complex numbers without any more difficulty than you have in dealing with polynomials.

If you have two complex numbers \(z=a+ib\) and \(w=c+id\), then \[ \begin{split} z+w&=(a+ib)+(c+id)\\ &=a+c+ib+id\\ &=(a+c)+i(b+d) \end{split} \] and \[ \begin{split} zw&=(a+ib)(c+id)\\ &=ac+ibc+aid+ibid\\ &= ac+i^2bd+bci+adi=(ac-bd)+(bc+ad)i \end{split} \]
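The two rules above translate directly into code. This is a small sketch that stores \(a+ib\) as the pair `(a, b)`; the function names `add` and `mul` are just illustrative choices.

```python
def add(z, w):
    """Add complex numbers stored as (real, imaginary) pairs."""
    a, b = z
    c, d = w
    return (a + c, b + d)

def mul(z, w):
    """Multiply pairs using i^2 = -1: (ac - bd) + (bc + ad)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, b * c + a * d)

z, w = (1, 2), (3, -1)   # 1 + 2i and 3 - i
print(add(z, w))         # (4, 1)
print(mul(z, w))         # (5, 5)
```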

And now the first bit of magic happens. Given \(z=a+ib\) we define a new thing called the complex conjugate of \(z\), which is \(\overline{z}=a-ib\). Then multiplying out the brackets gives \[ z\overline{z} = a^2+b^2 \] which is never negative (since \(a\) and \(b\) are both real numbers) and is only zero when \(z=0\). This quantity is so useful it gets a name: \(z\overline{z}=|z|^2\), and we call \(|z|\) the modulus of \(z\). Then we see that \[ z \times \frac{\overline{z}}{|z|^2} = 1 \] which means that we can divide one complex number by another: \[ \frac{w}{z} = w \times \frac{\overline{z}}{|z|^2} \] And, just as with real numbers, this works whenever \(z \neq 0\). Because of the way we've done this, the complex numbers have exactly the same algebraic properties as the real numbers, so all the calculations we could do before, we can still do: but now we can do more.
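As a sketch of how division by the conjugate works in practice, again storing \(a+ib\) as the pair `(a, b)` (the helper names are hypothetical):

```python
def conjugate(z):
    """Complex conjugate of a (real, imaginary) pair."""
    a, b = z
    return (a, -b)

def modulus_squared(z):
    """z times its conjugate, which is the real number a^2 + b^2."""
    a, b = z
    return a * a + b * b

def divide(w, z):
    """Compute w / z as w * conj(z) / |z|^2; z must be non-zero."""
    n = modulus_squared(z)
    if n == 0:
        raise ZeroDivisionError("cannot divide by the complex number 0")
    c, d = w
    a, b = conjugate(z)
    # Multiply w by conj(z), then scale by the real number 1 / |z|^2.
    return ((c * a - d * b) / n, (c * b + d * a) / n)

print(divide((5, 5), (1, 2)))   # (3.0, -1.0), i.e. (5 + 5i) / (1 + 2i) = 3 - i
```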

But what are they really?

So far, this is just a formal game. We pretend that there is a number whose square is \(-1\), and find that we can treat it just like an ordinary number and still do algebra. Amazingly, it all seems to work. But it doesn't really tell us what this thing, this complex number, actually is.

One thing we can do is notice that a complex number is built out of two real numbers. We could think of those numbers as the coordinates of a point in the \((x,y)\)-plane, or as the components of the position vector of that point.

Then if we think of \(z\) as \((a,b)\) and \(w\) as \((c,d)\) then we have what we always had, a way of adding vectors: \[ z+w=(a+c,b+d) \] but on top of that, we have a way of multiplying vectors, \[ zw=(ac-bd,ad+bc) \] and the multiplication is algebraically well-behaved.

Looking at it in more detail, we see some more magic.

\(z=(a,b)\) gives us a picture of \(z\) in terms of Cartesian coordinates. But we could just as well use polar coordinates \((r,\theta)\) to represent this point, where \(r\) is the distance from \(z\) to the origin, and \(\theta\) is the angle between the line joining the origin to \(z\) and the \(x\)-axis. \(r\) is just \(|z|\), the modulus of \(z\), and \(\theta\) is called the argument of \(z\).

If \(z\) has modulus \(r\) and argument \(\theta\), while \(w\) has modulus \(\rho\) and argument \(\phi\), it turns out that \(zw\) has modulus \(r\rho\) and argument \(\theta+\phi\).
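One way to see why, sketched on the assumption that you are happy to use the angle-addition formulas for sine and cosine: write \(z=r(\cos\theta+i\sin\theta)\) and \(w=\rho(\cos\phi+i\sin\phi)\), and multiply out. \[ \begin{split} zw&=r\rho(\cos\theta+i\sin\theta)(\cos\phi+i\sin\phi)\\ &=r\rho\left[(\cos\theta\cos\phi-\sin\theta\sin\phi)+i(\sin\theta\cos\phi+\cos\theta\sin\phi)\right]\\ &=r\rho\left[\cos(\theta+\phi)+i\sin(\theta+\phi)\right] \end{split} \] The moduli multiply and the arguments add.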

The multiplication isn't just algebraically well-behaved, it has a nice geometric interpretation.

So we can think of complex numbers as vectors in the plane with a well-behaved multiplication on top of the addition which we can always do for vectors.

But what are they really?

Once we start thinking about vectors, we notice that the real and imaginary parts of \(zw\) look like inner products. With a bit of fiddling about, we finally notice that we can represent \(z\) by a \(2\times 2\) matrix, \[ z=\left[ \begin{array}{cc}a&b\\-b&a \end{array} \right] \] and the matrix addition and multiplication correspond to adding and multiplying complex numbers: \[ z+w=\left[ \begin{array}{cc}a&b\\-b&a \end{array} \right] +\left[ \begin{array}{cc}c&d\\-d&c \end{array} \right] =\left[ \begin{array}{cc}a+c&b+d\\-(b+d)&a+c \end{array} \right] \] and \[ zw=\left[ \begin{array}{cc}a&b\\-b&a \end{array} \right] \left[ \begin{array}{cc}c&d\\-d&c \end{array} \right] =\left[ \begin{array}{cc}ac-bd&ad+bc\\-(ad+bc)&ac-bd \end{array} \right] \] Just to check: \[ i^2 = \left[ \begin{array}{cc}0&1\\-1&0 \end{array} \right]^2 = -\left[ \begin{array}{cc}1&0\\0&1 \end{array} \right] = -1 \] So complex numbers are just a particular kind of \(2 \times 2\) matrix, made entirely out of real numbers.

This means that we can get all the algebraic properties of complex numbers without inventing any new, imaginary, numbers, as long as we're willing to think of the real number \(x\) as the matrix \[ \left[ \begin{array}{cc}x&0\\0&x \end{array} \right]. \]
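If you would rather let a computer do the fiddling, here is a short pure-Python sketch of the matrix picture. The helper names `as_matrix` and `matmul` are assumptions made for this example.

```python
def as_matrix(a, b):
    """Represent a + ib as the 2x2 real matrix [[a, b], [-b, a]]."""
    return [[a, b], [-b, a]]

def matmul(m, n):
    """Ordinary 2x2 matrix product."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

i = as_matrix(0, 1)
print(matmul(i, i))                                # [[-1, 0], [0, -1]]
print(matmul(as_matrix(1, 2), as_matrix(3, -1)))   # [[5, 5], [-5, 5]]
```

The second product is the matrix for \(5+5i\), which is what \((1+2i)(3-i)\) should be.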

You may feel that the introduction of matrices is a high price to pay to get rid of non-existent numbers. There is another, more algebraic approach.

But what are they, really?

We can also build complex numbers in a familiar kind of way: it's very similar to congruence arithmetic, sometimes called clock arithmetic. All this means is that we choose an integer \(n \gt 1\), and whenever we do sums, we only keep the remainder when we divide by \(n\). A useful way to think of this is to think of each number as a multiple of \(n\) plus a remainder, then treat \(n\) as \(0\). We call this the arithmetic of \(\mathbb{Z}_n\).

For example, if \(n=3\) the possible remainders are \(0,1,2\), and we denote this set by \(\mathbb{Z}_3\). Then in \(\mathbb{Z}_3\), we have \[1+2=3 = 1 \times 3+0 = 0, \] \[ 2+2=4=1\times3+1= 1, \] and \[ 2 \times 2=4 = 1 \times 3 +1 =1. \]
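In Python the remainder operator `%` does exactly this, so the three sums above can be checked in a few lines (the function names are illustrative):

```python
def add_mod(a, b, n):
    """Add a and b, keeping only the remainder on division by n."""
    return (a + b) % n

def mul_mod(a, b, n):
    """Multiply a and b, keeping only the remainder on division by n."""
    return (a * b) % n

print(add_mod(1, 2, 3))   # 0
print(add_mod(2, 2, 3))   # 1
print(mul_mod(2, 2, 3))   # 1
```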

Now we run very far and fast with this ball.

We start with the set of all polynomials in \(X\) with real coefficients, but we only ever keep the remainder when we divide by \(1+X^2\).

This is a little more abstract looking, but given any polynomial \(P(X)\), we can write it as a multiple of \(X^2+1\) and a remainder. Keeping only the remainder has the same effect as treating \(X^2+1\) as \(0\); in other words, as treating \(X^2\) as \(-1\).

Actually, this is almost back to the beginning, where I said you can just think of a complex number as a polynomial in \(i\) where \(i^2=-1\); the only difference is that now we don't have to pretend that anything squared is \(-1\), we can just work with remainders like in congruence arithmetic.
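Here is a sketch of that remainder arithmetic in Python. Polynomials are stored as coefficient lists with the constant term first; that convention and the function names are assumptions made for this example.

```python
def reduce_mod_x2_plus_1(coeffs):
    """Remainder of a real polynomial on division by X^2 + 1.

    coeffs[k] is the coefficient of X^k. The result is a pair (r0, r1)
    standing for the remainder r0 + r1*X.
    """
    r0, r1 = 0, 0
    for k, c in enumerate(coeffs):
        # X^k leaves remainder 1, X, -1, -X, 1, ... for k = 0, 1, 2, 3, 4, ...
        sign = -1 if (k // 2) % 2 else 1
        if k % 2 == 0:
            r0 += sign * c
        else:
            r1 += sign * c
    return (r0, r1)

def poly_mul(p, q):
    """Multiply two non-empty polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (1 + 2X)(3 - X) = 3 + 5X - 2X^2, which leaves remainder 5 + 5X.
print(reduce_mod_x2_plus_1(poly_mul([1, 2], [3, -1])))   # (5, 5)
# X^2 itself leaves remainder -1.
print(reduce_mod_x2_plus_1([0, 0, 1]))                   # (-1, 0)
```

The first result matches \((1+2i)(3-i)=5+5i\), with \(X\) playing the part of \(i\).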

Again, we haven't introduced any new objects, any imaginary numbers with negative squares. We've taken mathematical objects we're already familiar with, and seen how to get out of them something which behaves in exactly the same way as these so-called complex numbers.

They are all of the above

There isn't really an answer to the question 'what are they, really?' They're all of the above. Mathematically, they are all exactly the same, in the sense that you can think of each of them as just an alternative way to write the others. But none of those representations is the 'right' one. They all give exactly the same arithmetic of complex numbers.

So we don't have to worry about 'making up imaginary numbers', and worrying that what we do doesn't make sense because they aren't 'real'. They're just as 'real' (in that they make mathematical sense) as a number whose square is \(2\). (There was some resistance to accepting \(\sqrt{2}\) as a number initially.)

And we also get some accidental benefits. We have different ways of thinking about complex numbers, which means that if we want to do something involving them we can use whichever way of thinking about them is most convenient.

If, that is, we want to do something involving them. Since we aren't sixteenth century mathematicians trying to persuade a nobleman to be our patron by being better at solving cubics than the competition, there remains the question of why we should care.

So what was the point of all that, then?

This is where it actually gets more interesting. The different representations make it natural to ask different questions, which means that we have more avenues for exploration than just having one.


Complex numbers were introduced because they cropped up naturally in the solution of cubic equations. What might we need to introduce to solve higher order polynomial equations? What if we let the coefficients of our polynomials be complex numbers themselves?

The astonishing answer is: nothing. Once you've allowed yourself complex numbers it turns out that you don't need anything else for higher order polynomials, or even polynomials with complex coefficients. The fundamental theorem of algebra tells us that any polynomial of degree \(n \geq 1\) with complex coefficients (and that includes real ones, since a real number is a complex number with an imaginary part of zero) has exactly \(n\) complex roots, as long as repeated roots are counted as many times as they are repeated.
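The theorem only promises that the roots exist; it doesn't say how to find them. As a purely numerical illustration, here is a sketch that uses the Durand-Kerner iteration (my choice of method, not something the theorem supplies) to approximate all the roots at once. The function name, the starting guesses and the iteration count are assumptions, and the sketch assumes the roots are distinct.

```python
def all_roots(coeffs, iterations=200):
    """Approximate all roots of a polynomial with the Durand-Kerner iteration.

    coeffs lists the coefficients from the highest power down and may be
    complex; the leading coefficient must be non-zero.
    """
    lead = coeffs[0]
    monic = [c / lead for c in coeffs]
    n = len(monic) - 1

    def p(z):
        # Horner evaluation of the monic polynomial.
        value = 0
        for c in monic:
            value = value * z + c
        return value

    # Starting guesses: powers of a fixed non-real number.
    roots = [(0.4 + 0.9j) ** k for k in range(n)]
    for _ in range(iterations):
        for i in range(n):
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            # Nudge each guess using the current values of all the others.
            roots[i] -= p(roots[i]) / denom
    return roots

# z^3 - 15z - 4 has degree 3, so there are 3 roots: 4 and -2 ± sqrt(3).
for r in all_roots([1, 0, -15, -4]):
    print(r)
```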

That's a beautiful mathematical result.

If beauty doesn't motivate you sufficiently, then take solace in the fact that the properties of some differential equations which arise in physics and engineering are determined by polynomials you can build out of them, and the systems modelled by these differential equations are stable if the real parts of all the roots are negative. Google for control theory to see lots more.
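As a small sketch of that stability test, take the simplest case: a second order linear equation with constant coefficients, whose polynomial is a quadratic. The two example equations and the function names are hypothetical, chosen only to show one stable and one unstable case.

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*s^2 + b*s + c = 0, allowing complex values."""
    s = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + s) / (2 * a), (-b - s) / (2 * a))

def is_stable(roots):
    """Stable when every root has a strictly negative real part."""
    return all(r.real < 0 for r in roots)

# x'' + 2x' + 5x = 0 gives the polynomial s^2 + 2s + 5.
damped = quadratic_roots(1, 2, 5)
print(damped, is_stable(damped))      # roots -1 ± 2i, so stable

# x'' - 2x' + 5x = 0 gives the polynomial s^2 - 2s + 5.
growing = quadratic_roots(1, -2, 5)
print(growing, is_stable(growing))    # roots 1 ± 2i, so not stable
```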


So we can multiply two dimensional vectors together...what about three dimensional vectors? It turns out that there is no way to do this that is well behaved. But you can do it with four dimensional vectors, if you're willing to lose the commutative law. And with eight dimensional ones, but you lose the associative law. Lots of fun mathematics to play with.
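The four dimensional version is usually known as the quaternions (the name is my addition here). A minimal sketch of their multiplication rule, with a quaternion stored as a tuple of four real numbers, shows the commutative law failing:

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(qmul(i, j))   # (0, 0, 0, 1), which is k
print(qmul(j, i))   # (0, 0, 0, -1), which is -k
```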

But again, if you want something more useful, the complex numbers can be thought of as determining a magnitude and an angle, or a phase. This turns out to be a useful way of thinking about alternating current, and you can analyse AC circuits using complex arithmetic in much the same way as you use real algebra to analyse DC circuits.
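For a flavour of how that goes, here is a sketch for a resistor, inductor and capacitor in series, using the standard complex impedances \(R\), \(i\omega L\) and \(1/(i\omega C)\). All the component values are hypothetical, picked only so that there are numbers to compute with. Python writes \(i\) as `1j`.

```python
import cmath
import math

# Hypothetical component values, chosen only for illustration.
R = 100.0        # resistance in ohms
L = 0.5          # inductance in henries
C = 10e-6        # capacitance in farads
f = 50.0         # supply frequency in hertz
V = 230.0        # voltage amplitude in volts, taken as the phase reference

omega = 2 * math.pi * f
# Series impedance: resistor + inductor + capacitor.
Z = R + 1j * omega * L + 1 / (1j * omega * C)

# Ohm's law, exactly as for DC, but with complex numbers.
I = V / Z
magnitude, phase = cmath.polar(I)
print(f"|I| = {magnitude:.3f} A, phase = {math.degrees(phase):.1f} degrees")
```

The modulus of `I` is the size of the current and its argument is how far the current is out of step with the voltage.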

More Algebraic

Can I try the same trick with a different polynomial from \(1+X^2\)?

Can you ever. This procedure is very similar to working in congruence arithmetic. When you work with the remainders on dividing by any \(n\), you always get well-behaved addition, subtraction and multiplication, but division (by anything other than zero) only works if you can't factorize \(n\), that is, if \(n\) is prime. In just the same way, with polynomials you always get well-behaved addition, subtraction and multiplication, and if the polynomial you divide by cannot be factorized, you also get division.
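The integer half of that claim is easy to check by brute force. This sketch (the function name is illustrative) lists which non-zero remainders have a multiplicative inverse, first for \(n=5\), which can't be factorized, and then for \(n=6\), which can.

```python
def inverses(n):
    """Map each non-zero remainder mod n to its multiplicative inverse, if any."""
    table = {}
    for a in range(1, n):
        for b in range(1, n):
            if (a * b) % n == 1:
                table[a] = b
                break
    return table

print(inverses(5))   # {1: 1, 2: 3, 3: 2, 4: 4}
print(inverses(6))   # {1: 1, 5: 5}
```

In \(\mathbb{Z}_5\) every non-zero remainder can be divided by; in \(\mathbb{Z}_6\) the remainders \(2\), \(3\) and \(4\) cannot.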

This turns out to be useful in (for example) the construction of error-correcting codes which have brought us back beautiful pictures from space probes, and are responsible for the robustness of CDs and digital television signals.

So the algebra of complex numbers not only makes sense, it turns out to be useful in a variety of ways.

The miracle

All the above is fascinating stuff, but it misses out a big area of mathematics, namely calculus. Given two complex numbers, \(z\) and \(w\), the distance between them in the plane is just \(|z-w|\): and since we have a notion of what it means for two complex numbers to be close to one another, we can think about what it means for a function to be continuous, or differentiable, and so on.

This leads to some entirely unexpected consequences.

It turns out that if a complex function can be differentiated once, it can be differentiated infinitely often, so we don't get monsters like the real functions which are differentiable just once. It turns out that not only can the function be differentiated infinitely often, it is actually equal to its Taylor series (as long as you don't go too far away from the point where you calculated the derivatives). It turns out that differentiable functions give mappings of the plane that preserve angles, and this can be used to analyse fluid flow and electric fields. It turns out that you can do a kind of integration which lets you compute things that are much harder to do without the complex numbers. It even turns out that they can tell you a lot about the behaviour of the normal integers, which is where the whole thing started.

And all because people in the sixteenth century had competitions to solve cubic equations.


This whistle stop tour of complex numbers was partly motivated by a Twitter discussion with @SallyBiskupic, @nomad_penguin and @DavidKButlerUoA.


  1. Love this collection of different perspectives.

    Re: "So we can multiply two dimensional vectors together...what about three dimensional vectors? It turns out that there is no way to do this that is well behaved."

    There is a way to multiply vectors in any number of dimensions that is well behaved in that it is associative, invertible, and geometrically meaningful: the geometric product. The one catch is that you don't get closure over vectors--it closes over a larger space with other interesting objects in it.

    I wrote something about why it's nice to be able to multiply and divide by vectors that you might enjoy: http://www.shapeoperator.com/2016/12/12/sunset-geometry/

    There are some references at the end that point to more thorough treatments.

  2. Thanks: I enjoyed putting it together.

    Yes, geometric algebra is an interesting topic, and gives some powerful tools. It's one of the many things I wish I knew more about, so thanks also for pointing me to your demonstration of how powerful it can be. As it happens I'm in the early stages of writing something where a reference to geometric algebra would be very natural, so I intend to give a link to your article (now that I know about it :-)).