Wednesday 12 February 2020

Calibration with Confidence

How can we assess large numbers of objects?

What's the Problem?

Suppose we want to assign scores to a large collection of objects – too large for a single assessor to deal with. We are pretty much forced into having a panel of assessors, each of whom assesses some small collection of the objects. Then the question is, how do we combine their scores to obtain final scores?

There are various possible approaches, depending on whether our final objective is an overall ranking of the objects, or an actual score for each object.

If a ranking is all that is required, then one interesting approach is to have each assessor rank the objects they assess, and then combine the rankings. Of course, the follow-up question that of the best way to combine the rankings. Maybe the most obvious answer is to choose the overal ranking that reverses as few as possible of the assessor rankings. This falls foul of the problem of computational expense: but there is an alternative approach, called Hodge ranking, which makes use of a discrete equivalent to some vector calculus. A nice introduction can be found here.

On the other hand, we may actually want a score. The most obvious approach to this is to take the average (by which I mean the arithmetic mean) of the scores awarded by the various assessors. At first blush, this may look like the best one can do, but it does have a significant problem caused by the fact that not all assessors will score each item.

The problem is that some assessors are generally more generous than others, i.e. may have an intrinsic bias towards scoring objects too high or too low. And since not all assessors provide a score for each object, some objects may get (un)lucky by having only (un)generous markers. It would be good to try to spot and compensate for this bias.

In the rest of this I'm going to describe an approach developed by Robert MacKay of the University of Warwick, and called Calibration with Confidence (I'll explain the name below).

A Statistical Model

We want to get the best information we can out of the scores provided by a panel of assessors.

Of course, the word 'best' is loaded. Best as measured how? To answer that we need a decent mathematical (or statistical, if you will) model of the situation. So here is the model that this approach assumes.

  • each object \(o\) to be assessed has an actual objective numerical value \(v_o\). (I know, I know. It's an imperfect universe.)
  • each assessor has a fixed amount by which they score high or low, their bias \(b_a\).
  • each assessor has a degree of consistency, the extent to which they award same mark to the same object on repeated assessments.

So when an assessor awards a score to an object, they give a score of \[ s_{ao} = v_o + b_a + \sigma_{ao} \varepsilon \] where \(v_o\) is the true value of the object, \(b_a\) is the bias of the assessor, \(\sigma_{ao}\) measures how consistently this assessor would score this object, and \(\varepsilon\) is a normal random variable of mean \(0\) and standard deviation \(1\).

As is the case with all statistical models, this is wrong; however, the assumptions are plausible for it to be useful.

We know what \(s_{ao}\); we want to find \(v_o\) and \(b_a\). But what about \(\sigma_{ao}\)?

This is something we also need to know in order to make good estimates for \(v_o\) and \(b_a\). But how can we know it?

One approach is simply to give up, and assume that they are all the same. In fact, if we do this, the method I'm going to describe here reduces to Fisher's method of incomplete block analysis.

Or we can try to give a plausible estimate of it, which will improve the values we find for \(v_o\) and \(b_a\).

There are a few possibilities even so.

  • Prior information: do we already have reason to believe (from previous experience) that an assessor is likely to be very consistent or very inconsistent?
  • Assessor experience: we assign smaller values of \(\sigma_{ao}\) in cases where we know that assessor \(a\) has lots of experience with the topic of object \(o\).
  • We can ask the assessor to give a range of values rather than a specific score, and use that range to approximate \(\sigma_{ao}\).

However we arrive at this, we define \[ c_{ao} = \frac{1}{\sigma_{ao}^2} \] to be the confidence in assessor \(a\)'s score form object \(o\).

Then our task is to find a good approximation to each \(v_o\) and \(b_a\) (so we are calibrating the assessors) given the values of \(s_{ao}\) and \(\sigma_{ao}\) (which measures the confidence we have in the assessor's consistency, or reliability): this is the explanation I promised about for the name, calibration with confidence.

The Algorithm

So, given the data \(s_{ao}\) and \(\sigma_{ao}\), we want to get the best possible estimate of each \(v_o\) and \(b_a\). One definition of what is best, which has the great advantage of being mathematically tractable, is the values of \(v_o\) and \(b_a\) which miminize \[ \sum c_{ao} (s_{ao} - v_o - v_b)^2. \] This gives higher weight to those contributions for which \(\sigma_{ao}\) is smaller.

A standard bit of calculus tells us that this is minimised if and only if for each object \(o\), \[ \sum_{a:(a,o) \in E} c_{ao}(s_{ao}-v_o-b_a) = 0 \] and for each assessor \(a\), \[ \sum_{o:(a,o) \in E} c_{ao}(s_{ao}-v_o-b_a) = 0 \] where, \(E\) is the set of assesor-object pairs of all the assessments that have taken place so this notation just means that the sums are over all those assessments which have taken place.

Let's just pause for a minute to think about these equations.

We know the values of \(s_{ao}\) and \(c_{ao}\) for each assessment (remembering that not all objects are assessed by all assessors); and we are trying to find the values of \(v_o\) and \(b_a\).

If we set \(c_{ao}=0\) in all the cases where no assessment took place, we can write the above equations slightly more evocatively as \[ \begin{split} \left( \sum_a c_{ao} \right) v_o + \sum_a b_a c_{ao} &= \sum_a c_{ao}s_{ao} = V_o\\ \sum_o c_{ao} v_o + \left(\sum_o c_{ao} \right) b_a &= \sum_o c_{ao}s_{ao} = B_a \end{split} \] where \(V_o\) and \(B_a\) are new names for \(\sum_a c_{ao}s_{ao}\) and \(\sum_o c_{ao}s_{a0}\) respectively.

We can make this look a little neater by expressing the equations in matrix form: \[ L \left[ \begin{array}{c} v\\b \end{array} \right] = \left[ \begin{array} {c} V \\B \end{array} \right] \] where \(L\) is built out of the various \(c_{ao}\) and \(s_{ao}\), and \(v\), \(V\), \(b\) and \(B\) are the vectors whose components are \(v_o\), \(V_o\), \(b_a\) and \(B_a\) respectively.

But now we have to ask the obvious question: does this system of equations have a solution, and if it does, is the solution unique?

The bad news is that the solution to these equations cannot be unique; if we have any solution, we can construct another by adding some number to all scores and subtracting the same number from all biases. But we can deal with this by placing a constraint on the biases, for example (and most simply) by requiring that they add up to \(0\), i.e. by appending the equation \(\sum b_a = 0\) to the above equations.

But does this fix the problem? It removes one source of degeneracy in the equations, but

  • are there other degeneracies?
  • are the equations consistent?

Well, there may be other degeneracies. However, we can remove them by ensuring that the network of assessments satisfies a very plausible condition.

To state this condition, first consider the assessments in the following way. We make a graph by using all objects and all assessors as nodes, and connect any pair by an edge if that assessor scores that object. (This is an example of a bipartite graph.) Then the system of equations will have a unique solution if and only if this graph is connected.

This is something we might have hoped to be true: it is saying that every assessor can be related to every other by some comparison of scores, possibly and indirect comparison via some other intermediate assessors. The miracle is that this is exactly what we need.

Does It work?

The proof of any pudding is in the eating. So, does this work in practice?

This is clearly difficult to answer, because we don't have some kind of magical access to true values and assessor biases to check against.

However, we can at least check that it works against simulated data which satisfy the statistical assumptions of the model. And testing shows that it gives better estimates of true scores than simple averaging, and better estimates of true scores and biases that the incomplete block analysis approach which does not incorporate confidence values.

We can also compare the results with those obtained in real assessment exercises, and it does give different results. Given that we know that some assessors are more generous than others, and that some are more consistent than others, we can be fairly confident that the results are an improvement on simple averaging.

What is it Good For?

There are two situations that spring immediately to my mind in which this approach can be useful:

  • when there are many objects to be assessed, and it is not possible for assessors to assess every object (which is, of course, where we came in).
  • for training assessors, so that a reasonable estimate of assessor bias can be found and this used as useful feedback in the training process.


All the details which have been glossed over above (and many more) can be found in
RS MacKay, R Kenna, RJ Low and S Parker (2017) Royal Society Open Science Calibration with confidence: a principled method for panel assessment.

And finally, if you'd like to play with or explore the algorithm, you can get download it as an Excel spreadsheet or a Python (2.7) package from here.

Tuesday 20 August 2019

Second derivative versus differentiating twice.

How should we think of the second derivative, and what does it tell us?

The Derivative

Before worrying about higher derivatives, let's remember what the first derivative is and does.

We say that a function \(f:I \to \mathbb{R}\) (where \(I\) is some open interval in \(\mathbb{R}\)) is differentiable at \(x \in I\), with derivative \(f'(x)\) if \[ \lim_{h \to 0} \frac{f(x+h)-f(x)}{h} = f'(x). \]

Why do I say \(I\) is an open interval?

It's just to make sure that if \(x \in I\), then for \(h\) sufficiently small, \(x+h\) is also in \(I\). You can also work with the endpoint of an interval, but then you have to worry about one-sided limits, and it's a bit messier. Open intervals are more convenient.

With all this in hand, it's not too hard to show that a function \(f\) is differentiable at \(x\), with derivative \(f'(x)\) if (and only if) \[ f(x+h)=f(x)+hf'(x)+o(h) \] where \(o(h)\) is an error term with the property that \[ \lim_{h \to 0} \frac{o(h)}{h} = 0 \] so that the derivative is the best linear approximation to changes in the values of \(f\), if one exists.

I claimed in an earlier post this this best linear approximation is also the best way to think about the derivative.

The second derivative

So, suppose we have a function \(f:I \to \mathbb{R}\), and it's differentiable at each \(x \in I\). Then we have a new function, \(f'\), given by \(f':I \to \mathbb{R}\), which gives the derivative of \(f\) at each \(x \in I\).

Now, there's no particular reason for \(f'\) to be differentiable. In fact, there's no obvious reason for it even to be continuous, and in fact it doesn't have to be. But it can be, and that's the case we'll think about now.

A good trick is worth repeating. So, given \(f':I \to \mathbb{R}\), it is differentiable at \(x \in I\) with derivative \(f''(x)\) if \[ f''(x) = \lim_{h \to 0} \frac{f'(x+h)-f'(x)}{h} \] and we call this the second derivative of \(f\) at \(x\).

And then it turns out that when this happens, we have \[ f(x+h)=f(x)+hf'(x)+\frac{h^2}{2}f''(x)+o(h^2) \] where \(o(h^2)\) is now an error term which has the property that \[ \lim_{h \to 0}\frac{o(h^2)}{h^2}=0. \]

In other words, the first and second derivative between them give the best quadratic approximation to changes if \(f(x)\). (This idea, and repeating it with more derivatives and higher order polynomial approximations leads to the idea of jets, about which I will say no more: follow the link if you are intrigued.)

This is very nice. We can now think of the second derivative as the correction which gives us a best quadratic approximation to the values of \(f(x)\), and use this best quadratic approximation as an alternative definition: \(f:I \to \mathbb{R}\) is twice differentiable at \(x\) with first and second derivatives \(f'(x)\) and \(f''(x)\) if \[ f(x+h)=f(x)+hf'(x)+\frac{h^2}{2}f''(x)+o(h^2) \]

Except that we can't. Unlike the case of the first derivative, this does not characterize second derivatives. If a function has a second derivative, then we get a best quadratic approximation. But this time we can't make the reverse argument: the existence of a best quadratic approximation does not imply that a function can be differentiated twice.

Monstrous Counter-examples

There's a useful function (at least, useful if you're looking for counter-examples to do with continuity) defined on \(\mathbb{R}\) as follows: \[ f(x) = \left\{ \begin{array}{ccc} 0 & \mbox{if} & x \in \mathbb{Q}\\ x &\mbox{if} & x\in \mathbb{R} \setminus \mathbb{Q} \end{array} \right. \] where \(\mathbb{Q}\) denotes the set of rational numbers. This function has the (maybe surprising) property that it is continuous at \(0\), but nowhere else. (If you enjoy pathological function like this, there are more here.)

We can build on this. Let's think about the function \(g\) given by \(g(x)=f(x)^2\), and think about what happens at \(x=0\). We have \[ g(0+h)= \left\{ \begin{array}{ccc} 0 & \mbox{if} & h \in \mathbb{Q}\\ h^2 &\mbox{if} & x\in \mathbb{R} \setminus \mathbb{Q} \end{array} \right. \]

Then \[ \begin{split} |g(0+h)-g(0)-h \times 0| &= \left\{ \begin{array}{ccc} 0 & \mbox{if} & h \in \mathbb{Q}\\ h^2 &\mbox{if} & x\in \mathbb{R} \setminus \mathbb{Q} \end{array} \right.\\ & \leq |h^2| \end{split} \] and therefore \[ g(0+h)=g(0)+ h\times 0 +o(h) \] i.e. \(g\) is differentiable at \(0\) with derivative \(0\).

But \(g\) is not even continuous at other values of \(x\). So we have a function which is differentiable at just one point.

Let's run with this ball.

Consider the function \(h\) defined by \(h(x)=f(x)^3\).

Again, this function is continuous at \(0\) but nowhere else. Since it isn't continuous away from \(0\), it certainly isn't differentiable in any neighbourhood of \(0\), so it can't be differentiated twice: a differentiable function must be continuous.

But on the other hand, \[ \begin{split} |g(0+h) - g(0) - h \times 0 - \frac{h^2}{2} \times 0| &= \left\{ \begin{array}{ccc} 0 & \mbox{if} & h \in \mathbb{Q}\\ |h^3| &\mbox{if} & x\in \mathbb{R} \setminus \mathbb{Q} \end{array} \right.\\ & \leq |h^3| \end{split} \] so that \[ g(0+h) = g(0) + h \times 0 + \frac{h^2}{2} \times 0 + o(h^2) \] and so \(g\) does have a best quadratic approximation at \(0\), even though it is only differentiable once there.


So, what should we make of this?

We know that a best linear approximation really is a first derivative; being linearly approximable is equivalent to being differentiable.

But this doesn't work for higher degree approximations. A function may be quadratically approximable without being twice differentiable. The two notions are now separate.

We might say that the function has a second derivative, even though it cannot be differentiated twice, but that would probably lead to unnecessary confusion: it's better to accept that the relationship between approximations and derivatives doesn't actually hold for higher derivatives in the same way as it does for the first derivative.

You might think that this doesn't matter much in practice, since the functions we actually use don't have these weird continuity or differentiability properties. I'd be hard put to argue with that.

Nevertheless I think it's an instructive example of something you might reasonably expect to be true turning out not to be, and of the kind of weirdness that you can find when you allow yourself to consider functions that aren't simple combinations of analytic ones.

Sunday 14 July 2019

Infinite arithmetic and the parable of the bingo balls

Hilbert's (or The Infinite) Hotel is a well-known setting for thinking about infinite sets, and in particular for thinking about their sizes. There's a less well-known approach that I think complements it nicely. I don't know who came up with it, so I'll just call it the parable of the bingo balls.

The Setting

Here's the idea. You have two baskets. One of them, basket A, contains a collection of bingo balls, one labelled with each positive integer. The other, basket B, is empty.

At 12 noon, you transfer some of the of balls from basket A to basket B: ball \(1\) up to ball \(n_1\), where \(n_1 \ge 1\). At 12:30, you remove some number of balls from basket B, and transfer some balls, ball \(n_1+1\) to ball \(n_2\), where \(n_2 \ge n_1+1\) from basket A to basket B. At 12:45, you remove some more balls from basket B, and transfer the next collection of balls, always choosing the ones with the the next available labels, from basket A to basket B. You repeat this process whenever the amount of time until 13:00 is halved. Clearly you need infeasibly fast hands to do this in practice: fortunately, we don't need to worry about this, we're just thinking about an ideal mathematical process.

What is in the baskets at 13:00?

It's not hard to persuade yourself that basket A is now empty. Whatever \(n\) you think of, ball \(n\) was removed on or before the \(n^{\text{th}}\) operation. But what about basket B?

It turns out that this depends on the process.

Basket B is empty

In the simplest case, at the first operation you move ball \(1\) from basket A to basket B, and at each subsequent operation you remove from basket B the ball that just got put in, and transfer the next ball from basket A to basket B.

The after each operation, basket B has one ball in it. Indeed, after any number of operations, basket B has one ball in it.

You might naively think that since there is one ball in basket B after every operation, there will be one ball in basket B at 13:00. But funny things happen with infinity.

Ball 1 isn't in basket B, because that was removed at the second operation. Ball 2 isn't, because it was removed at the third. Ball 3 isn't, because that was removed at the fourth. And so on. Whatever ball you name, say ball \(n\), it was removed at operation ball \(n+1\). So after we finish, there are no balls at all in basket B.

A bit more surprising is the fact that I can move as many balls as I like from basket A to basket B at each stage and basket B still ends up empty.

For example, suppose I move two balls from A to B at each stage, so in operation \(n\) I move balls \(2n-1\) and \(2n\) from A to B. Then after \(n\) operations, I've moved \(2n\) balls from A to B, and removed \(n-1\) balls, so there are \(n+1\) balls in basket B. The number of balls in basket B is growing larger and larger. Isn't it obvious that there will be infinitely many balls in basket B when I finish?

But no. We can use exactly the same argument as before. Ball \(1\) is removed at the second operation, ball \(2\) at the third, and so on. Whatever ball you think of, it is removed at some stage (though balls with a big value of \(n\) have to lie about in basket B for quite a while before they are removed).

Something funny is happening here. At stage \(n\), there are \(n+1\) balls in basket B, but after infinitely many steps there are none. Am I trying to say that when \(n\) gets really large, \(n+1\) is somehow getting near to 0?

No, I'm not that crazy. What I'm saying is that the number of balls in basket B at the end cannot be thought of as somehow being approached more closely to by the number of balls in basket B after \(n\) operations, as \(n\) gets bigger and bigger.

It's not hard to see that it doesn't matter how many balls I move from basket A to basket B at each stage, eventually all of them get removed from basket B, and basket B ends up empty.

In fact, I can even delay the process of removing balls from basket B as often as I want, or do it as rarely as I want (as long as I don't stop), and the result is the same. If I wait until operation \((n+1)^2\) to remove ball \(n\), then all the balls in basket B are eventually removed, even though I am putting balls in at every operation, and taking out a single ball at larger and larger intervals.

Here's a fun example. At operation \(1\) I move over one ball, then at operation \(2\) I remove one and move over three, and then at operation \(n\) I remove one and I move over \(n+1\) balls. After then \(n^\text{th}\) operation there are \(n\) more balls than there were before. Then after \(n\) operations, there are \(1+2+\ldots+n\) balls in basket B. So it looks like after I'm finished, there ought to be \(1+2+3+\ldots\) balls in basket B, but in fact there are none.

Of course, I'm not trying to say that \[ 1+2+3+\ldots = 0 \] any more than I would try to say that \[ 1+2+3+\ldots = -\frac{1}{12}. \]

You could even try to persuade yourself that this is reasonable: I put infinitely many in, and I take infinitely many out, so there are none left.

But there's more to it than that.

Basket B isn't necessarily empty

I put in a single ball at each operation. But I wait until after \(N+1\) operations before I start removing balls.

I could do this by first removing ball \(1\), then ball \(2\) and so on, and basket B would end up empty.

But I could also do it by first removing ball \(N+1\) in operation \(N+2\), then ball \(N+2\), and so on. When 13:00 rolls along, every ball with a label of \(N+1\) or more is gone, but balls \(1\) to \(N\) are still there.

So I can arrange for basket B to have any number of balls I want in it by choosing my removal strategy appropriately.

In fact, I can even arrange for basket B to have infinitely many balls in it.

One way to go about it is to move ball \(n\) from basket A to basket B at operation \(n\), and then alternate removing nothing from basket B and removing the next even-numbered ball. This leaves all the odd-numbered balls in basket B at the end, which is an infinite collection.

You may feel that's somehow cheating: you're only taking a ball out on every second operation: but I could also just remove the ball with the next number, so on operation \(2n\) I remove ball number \(n\), and that way I'm left with basket B empty.

Alternatively, I could move two balls from basket A to basket B on each operation, and take out one of those balls at the next operation.

So, the net effect of removing an infinite set from an infinite set can be to leave nothing at all, a finite set of any size you like, or an infinite set of the same size as the one you started with.

What does this tell us?

You can extract more than one possible message from this.

One message is that arithmetic involving the infinite has some counter-intuitive properties.

I think this is the commonest point of view: We have a game with all sort of interesting properties to explore, so we do. It's certainly mine.

If you take this point of view, you find that the infinite sets I've been looking at are only one kind: a set whose elements can be matched up with the positive integers is called countable, and what I've been calling infinite is really countably infinite. There are bigger (and bigger and bigger) kinds of infinity to play with, once you've sorted out what you mean by bigger when you're talking about infinite sets.

If this sounds like fun to you, then you might like to check out the properties of cardinal numbers.

Another is that arithmetic involving the infinite has unacceptable properties, so you reject it as a part of mathematics.

Some people take this point of view very seriously. They will accept the notion of integers of any size, no matter how big, but regard it as illegitimate to talk about a set of all integers, because that would have infinitely many elements, and they find the price too bit to pay.

If that sounds like you, or like you want to be, you might find it interesting to check out finitism .


Thanks to Adam Atkinson for pointing out some ambiguity in the first version.

Saturday 25 May 2019

A second look at complex differentiation

What does it mean to differentiate a complex function?

If we have a complex function, \[ f:\mathbb{C} \to \mathbb{C}, \] then we can try to define a derivative by analogy with the real version: we say that \(f\) has derivative \(f'(z)\) at \(z\) if \[ \lim_{h \to 0} \frac{f(z+h)-f(z)}{h} = f'(z). \]

The catch is that now \(h\) is a complex number. For this to make sense, it must give the same answer no matter how \(h \to 0\). The usual thing is then to insist that you get the same answer if \(h\) is pure real or pure imaginary, and find some constraints that have to be satisfied: these are the Cauchy-Riemann equations.

If you're of a suspicious turn of mind, you might ask why it's enough for it to work for \(h\) pure real or pure imaginary: that makes \(h\) approach \(0\) along one of two rather special lines. What if \(h\) doesn't do that? Also, this argument shows that if \(f\) is complex differentiable, it must satisfy the Cauchy-Riemann equations: but it doesn't tell us that if the Cauchy-Riemann equations are satisfied, then \(f\) must actuallly be complex differentiable.

But there's another approach, which I think has a few advantages.

  • It's based on a general definition of derivative, rather than guessing something that 'looks like' the real (variable, that is) derivative.
  • It doesn't relay on \(h\) approaching \(0\) in any particular way.
  • It shows that the Cauchy-Riemann equations are necessary and sufficient for complex differentiability.
  • I just think it's prettier.

So, let's recall what the derivative really is: if you have a function \(\pmb{f}:\mathbb{R}^n \to \mathbb{R}^m\), so\(\pmb{f}\) is a vector with components \(f^1 \ldots f^m\), then \(\pmb{f}\) is differentiable at \(\pmb{x}\) with derivative \(D\pmb{f}(x)\) (an \(m \times n\) matrix) if \[ \pmb{f}(\pmb{x}+\pmb{h}) = \pmb{f}(\pmb{x})+D\pmb{f}(\pmb{x})\pmb{h}+o(\|\pmb{h}\|). \]

Then the \(i,j\) element of \(D\pmb{f}(x)\) is \(\partial f^i/\partial x^j\).

In the case where \(\pmb{f}:\mathbb{R}^2 \to \mathbb{R}^2\), we can be much more explicit than that. We write \[ \pmb{f}(\pmb{x} ) = \left( \begin{array} {c} u(x,y)\\ v(x,y) \end{array} \right) \] where \[ \pmb{x} = \left( \begin{array} {c} x\\ y \end{array} \right) \] and then \[ D\pmb{f} = \left( \begin{array}{cc} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y}\\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{array} \right) \] has the property that \[ \begin{split} \left( \begin{array}{c} u(x+h_x,y+h_y)\\ v(x+h_x,y+h_y) \end{array} \right) &= \left( \begin{array}{c} u(x,y)\\ v(x,y) \end{array} \right) + \left( \begin{array}{cc} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y}\\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{array} \right) \left( \begin{array}{c} h_x\\ h_y \end{array} \right) +o(\sqrt{h_x^2+h_y^2})\\ &= \left( \begin{array}{c} u(x,y)\\ v(x,y) \end{array} \right) + \left( \begin{array}{c} \frac{\partial u}{\partial x}h_x + \frac{\partial u}{\partial y} h_y\\ \frac{\partial v}{\partial x}h_x + \frac{\partial v}{\partial y} h_y \end{array} \right) +o(\sqrt{h_x^2+h_y^2}). \end{split} \]

But now we can define the complex number \(z=x+iy\), and the complex function \(f:\mathbb{C} \to \mathbb{C}\) by \[ f(x+iy) = u(x,y)+iv(x,y). \] So \(f\) will be differentiable at \(z\) with derivative \(f'(z)\) if \[ f(z+h)=f(z)+f'(z)h+o(|h|) = f(z)+f'(z)h + o(\sqrt{h_x^2+h_y^2}) \] where \(h=h_x+ih_y\).

So what is this \(f'(z)\)?

Let's call it \(d_x+id_y\), so that \(d_x\) and \(d_y\) are the real and imaginary parts of \(f'(z)\) respectively. Then we have \[ \begin{split} f'(z)h &= (d_x+idy)(h_x+ih_y) \\ &= d_x h_x-d_y h_y + i(d_y h_x + d_x h_y). \end{split} \]

So now we have two different ways of describing the same function: once as a function \(\mathbb{R}^2 \to \mathbb{R^2}\), and once as a function \(\mathbb{C} \to \mathbb{C}\). But it's the same function, both times, so the derivative must really be the same. In other words, since the expressions must match for all possible \(h_x\) and \(h_y\), \(\pmb{f}\) can be thought of as a complex differentiable function if and only if \[ \begin{split} \frac{\partial u}{\partial x}h_x + \frac{\partial u}{\partial y} h_y &= d_x h_x-d_y h_y \\ \frac{\partial v}{\partial x}h_x + \frac{\partial v}{\partial y} h_y &= d_y h_x + d_x h_y. \end{split} \]

Matching these up, then, we have the conditions \[ \begin{split} d_x &= \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}\\ d_y &= \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \end{split} \]

And putting it all together, we see that the real function given by \((x,y) \to (u(x,y),v(x,y))\) can be thought of as a complex differentiable function \(x+iy \to u(x,y)+iv(x,y)\) if and only if \(u\) and \(v\) satisfy the Cauchy-Riemann equations \[ \begin{split} \frac{\partial u}{\partial x} &= \frac{\partial v}{\partial y}\\ \frac{\partial v}{\partial x} &= -\frac{\partial u}{\partial y}. \end{split} \]

I, at least, find this more satisfying.

Monday 29 April 2019

Cauchy and the (other) mean value theorem.

I just realised that there's a neat relationship between Cauchy's integral formula and the mean value theorem for harmonic functions of two variables, so for the benefit of anybody else who hasn't already noticed it, here it is.

First, let's remember what the Cauchy integral formula tells us.

If \(f\) is a holomorphic function on a simply connected domain, \(D\), \(a \in D\), and \(C\) is a simple closed contour in \(D\) surrounding \(a\), then \[ f(a) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z-a} dz. \]

I confess, this always seems like magic to me.

One way of seeing that it is true is to note that for a holomorphic function continuous deformation of the countour doesn't change the integral: therefore we can take \(C\) to be an extremely small circle centred on \(a\), so that \(f(z)\) is very close to \(f(a)\) all around the contour. Then if we parameterise \(C\) as \(a+\varepsilon e^{i \theta}\), where \(0 \leq \theta < 2\pi\), the integral is approximately \[ \frac{f(a)}{2 \pi i} \oint_C \frac{1}{z-a} dz = \frac{f(a)}{2 \pi i} \int_0^{2 \pi} i d\theta = f(a) \] and the approximation can be made arbitrarily good by choosing \(\varepsilon\) small enough.

And even with that, it still seems like magic.

Now, let's take a look at this in terms of real and imaginary parts. We can think of the complex number \(z\) as \(x+iy\), and then thinking about \(f\) as a function of \(x\) and \(y\). As is customary, I write: \[ f(z) = u(x,y)+iv(x,y) \] where \(f\) being holomorphic tells us that \(u\) and \(v\) satisfy the Cauchy-Riemann equations, \[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = - \frac{\partial v}{\partial x} \] and as a consequence \[ \frac{\partial^2 u}{\partial x^2}+\frac{\partial^2 u}{\partial y^2} = 0 = \frac{\partial^2 v}{\partial x^2}+\frac{\partial^2 v}{\partial y^2} \] so that both \(u\) and \(v\) are harmonic functions.

It's less obvious, but equally true that if \(u\) is a harmonic function of \(x\) and \(y\), then there is another harmonic function \(v\) such that \[ f(z)=f(x+iy) = u(x,y) + iv(x,y) \] is holomorphic.

So now let's see what the Cuachy integral formula says, in terms of \(u\) and \(v\).

To this end, let \(C\) be a circular contour of radius \(R\), centred on \(a\), and parameterised by \(\theta\) as \[ a+Re^{i\theta}, \qquad 0 \leq \theta < 2\pi \]

Then we have \[ \begin{split} \frac{1}{2 \pi i} \oint_C \frac{f(z)}{z-a}dz &= \frac{1}{2 \pi i} \int_0^{2 \pi} \frac{f(z(\theta))}{Re^{i\theta}} iRe^{i\theta} d\theta \\ &= \frac{1}{2\pi} \int_0^{2\pi} u(x(\theta),y(\theta))d\theta + i \frac{1}{2\pi} \int_0^{2\pi} v(x(\theta),y(\theta))d\theta \end{split} \]

So the real part of \(f(a)\) is the mean value of \(u(x,y)\) on a(ny) circle centred on \(a\), and the imaginary part is the mean value of \(v(x,y)\) on a(ny) circle centred on \(a\).

But the real part of \(f(a)\) is just \(u(Re(a),Im(a))\), and similarly for \(v\)!

This result, that the value of \(u(x,y)\) for harmonic \(u\) is the mean value of \(u\) on any circle centred on \((x,y)\), is the mean value theorem for harmonic functions.

We can go backwards: assuming the mean value theorem for harmonic functions, we can deduce the Cauchy integral formula (at least, if we are happy that contour integrals of holomorphic functions are unchanged by a deformation of the contour).

So it turns out that Cauchy's integral formula, a fundamental result in complex analysis, is equivalent to a real analysis result about the properties of harmonic functions.

In retrospect, I suppose it should be obvious in some sense that Cauchy's integral formula relies on the properties of harmonic functions, since holomorphic functions are built out of harmonic ones. But I hadn't put the two ideas together like this before.

Tuesday 23 April 2019

Trig functions with (almost) no triangles

In this post I'm going to look at what we can say about the trigonometric functions, \(\sin\) and \(\cos\) without actually doing any classical trigonometry, in other words, without using triangles or ratios of sides - if we know enough other maths. Note that this is not a "we should teach it this way" article it's an "isn't it great how mathematics all hangs together" article.


We start off with a little Euclidean geometry, as described in Cartesian coordinates.

Consider the unit circle in the Euclidean plane, centred on the origin, so given by \[ x^2+y^2=1 \]

and define \(\cos(t)\) and \(\sin(t)\) as the \(x\) and \(y\) coordinates of the point that you get to on this circle if you start at the point \((1,0)\) and travel \(t\) units of length (inches if the circle is one inch in radius, meters if it is one meter in radius, and so on) anti-clockwise around the circle.

This is how I define the functions \(\cos\) and \(\sin\); at some point I should make contact with the usual definitions in terms of ratios of side lengths in right-angled triangles. But not yet.

But we can immediately notice that since I've defined these functions as the \(x\) and \(y\) coordinates of points on this circle, we know that \[ \sin^2(t)+\cos^2(t)=1 \] and since the circumference of the circle is \(2\pi\), both functions are periodic with period \(2\pi\).

I can also extend my definition to negative values of \(t\), by thinking of this as distance in the clockwise direction from the same starting point.

Then we also see immediately from the symmetry of the circle that \[ \sin(-t)=-\sin(t), \qquad \cos(-t)=\cos(t) \]


And now a tiny nod towards differential geometry.

Let's think about the circle, where I now parameterise it by the \(t\) I defined above: so we can associate with any value of \(t\), the point \[ \pmb{r}(t) = (\cos(t),\sin(t)) \] (and I commit the usual sin of identifying a point in the plane with its position vector).

So, what can I say about \(\dot{\pmb{r}}\) (where I use the overdot to denote differentiation with respect to \(t\), in the finest Newtonian tradition).

Again, since each point lies on the unit circle, we have \[ \pmb{r}(t).\pmb{r}(t)=1 \] so, differentiating, we have \[ 2\pmb{r}(t).\dot{\pmb{r}}(t)=0 \] and the tangent to the circle at any point is (as we already know from Euclidean geometry) orthogonal to the radius to that point.

From this, we know that \[ \dot{\pmb{r}}(t) = k(-\sin(t),\cos(t)) \] for some \(k\).

We can say more, though. Since \(t\) is distance along the circle, so that for all \(t\) \[ t = \int_{0}^t \|\dot{\pmb{r}}(u) \| du, \] then we must have \[ \| \dot{\pmb{r}}(t) \| = 1. \]

Finally, we note that when \(t=0\), the tangent vector points vertically upwards. This finally leaves us with \[ \dot{\pmb{r}}(t)=(-\sin(t),\cos(t)) \] or, in other words, \[ \frac{d}{dt}\sin(t) = \cos(t), \qquad \frac{d}{dt}\cos(t)=-\sin(t). \]


Now we take a little detour through some real analysis.

For no apparent reason, I introduce two functions defined by power series, \[ \text{SIN}(t) = \sum_{n=0}^\infty (-1)^{n}\frac{t^{2n+1}}{(2n+1)!}, \qquad \text{COS}(t) = \sum_{n=0}^\infty (-1)^{n} \frac{t^{2n}}{(2n)!}. \]

These are new functions, and as far as we know so far, distinct from \(\sin\) and \(\cos\) as I defined them up above. But we'll see in a moment that the choice of names is not a coincidence, and that there is a very close relationship between these power series and the geometrically defined \(\cos\) and \(\sin\).

We can immediately see that \(\text{SIN}(0)=0\) and \(\text{COS}(0)=1\).

But since the two functions are defined by power series, we can differentiate both term by term and find that \[ \frac{d}{dt}\text{SIN}(t) = \text{COS}(t), \qquad \frac{d}{dt}\text{COS}(t)=-\text{SIN}(t). \]

Differentiating again, we find that each of \(\sin\), \(\cos\), \(\text{SIN}\) and \(\text{COS}\) satisfies the ordinary differential equation \[ \ddot{X} = -X. \] Furthermore, both \(\sin\) and \(\text{SIN}\) satisfy the same initial conditions at \(t=0\), namely that the value of the function is \(0\), and its derivative is \(1\); similarly for \(\cos\) and \(\text{COS}\). We therefore deduce that \[ \sin(t)=\text{SIN}(t), \qquad \cos(t)=\text{COS}(t). \]

This gives us two things:

  • An effective way of calculating the functions (at least for small values of \(t\)).
  • Another handle on the functions, which we are just about to use.

Addition Formulae

Finally we make use of just a little complex analysis.

Looking at the power series we now have for \(\sin\) and \(\cos\), and remembering the power series for the exponential function, we have \[ \exp(it)=\cos(t)+i\sin(t). \]

But now we have \[ \begin{split} \cos(s+t)+i\sin(s+t) &= \exp(i(s+t))\\ & = \exp(is)\exp(it)\\ & = \cos(s)\cos(t)-\sin(s)\sin(t)\\ & +i(\sin(s)\cos(t)+\cos(s)\sin(t)) \end{split} \] from which we extract \[ \cos(s+t)=\cos(s)\cos(t)-\sin(s)\sin(t), \qquad \sin(s+t)=\sin(s)\cos(t)+\cos(s)\sin(t). \]

And finally, some triangles

It's time to make contact with the triangle notion of these trigonometric functions, so consider a right angled triangle, and let \(\theta\) be the radian measure of the angle at one of its vertices, \(V\), (not the right angle). Place this triangle in the first quadrant, with the vertex \(V\) at the origin, and the adjacent side along the \(x\)-axis. (This may require a reflection.) Call the vertex on the \(x\) axis \(U\) and the remaining vertex \(W\).

But we can scale the triangle by dividing each side length by the length of the hypotenuse, which doesn't affect any of the ratios of side lengths.

The traditional definition of the sine of angle \(\theta\) is that it is the length of the adjacent side to \(V\) divided by the length of the hypotenuse; similarly \(\cos(\theta)\) is the length of the side opposite to \(V\) divided by the length of the hypotenuse. Then in our rescaled triangle, the vertex \(U'\) corresponding to \(U\) lies on the unit circle, and (by the definition of radian measure of angles) the angle \(\theta\) is the distance around the circle from the point \((1,0)\).

But now, the length of the hypotenuse is \(1\), and so the ratio for the sine of \(\theta\) is just our \(\sin(\theta)\), and similarly for \(\cos(\theta)\).

So, finally, we see that \(\sin\) and \(\cos\) have all the old familiar properties.

I love it when a bunch of different bits of maths come together.


Thanks for @ColinTheMathmo for some useful comments and corrections.

Thursday 21 March 2019

Solar garbage disposal.

Why can't we drop our dangerous waste into the sun?

It's tempting to consider getting rid of dangerous (or undesirable) waste by shooting it into space and letting it fall into the sun; after all, it seems pretty unlikely that we could drop enough of anything into the sun to have an adverse effect on it.

So, just how hard is this?

Let's just start off by thinking about how hard it start off with something in the same orbit as the earth, and knock it out of that orbit into whose closest approach to the sun actually takes it inside the sun. We can worry later about getting it into this initial orbit.

Now, an orbit consists of an ellipse with one focus at the sun. Let's call the distance from the (centre of the) sun at perihelion (the closest point) \(r_1\), and the distance at aphelion (the farthest point) \(r_2\), and the orbital speeds at these positions \(v_1\) and \(v_2\) respectively. Then we have the (badly drawn) situation below:

Perihelion and aphelion convenient points to think about, because the orbital velocities at these points are perpendicular to the lines joining them to the focus. This makes it particularly easy to calculate the angular momentum there.

It's really quite hard to describe orbits properly: finding the position of an orbiting body as a function of time cannot be done in terms of the usual standard functions. However, we can avoid a lot of unpleasantness by being clever with two conserved quantities, the energy and angular momentum of the orbiting body. Since these are both conserved, they always have the value they have at perihelion.

At perihelion, then, the body has angular momentum and energy given by \[ \begin{split} L &= mv_1 r_1\\ E &= \frac{1}{2}mv_1^2 - \frac{GMm}{r_1} \end{split} \] where \(G \approx 6.67 \times 10^{-11}\text{m}^{3}\text{kg}^{-1}\text{s}^{-2}\) is Newton's gravitational constant, and \(M\) is the mass of the gravitating body (which for us will be the sun).

To reduce the algebra a little, we'll take advantage of the fact that everything is a multiple of \(m\), and think of the angular momentum and energy per unit of mass (i.e. per kilogram). That means we think about \[ \begin{split} l &= v_1 r_1\\ e &= \frac{1}{2}v_1^2 - \frac{GM}{r_1} \end{split} \]

A useful observation is that for a closed orbit, we must have \(e<0\); we'll save this fact up for later.

And now we'll do the sneaky thing of finding what the perihelion distance is for an orbit of a given angular momentum and energy.

The idea is simple enough: use the first equation to express \(v_1\) in terms of \(l\), then substitute that into the second equation to get a relationship between \(l\), \(e\) and \(r_1\).

This gives \[ e= \frac{l^2}{2r_1^2} - \frac{GM}{r_1} \] so that \[ r_1^2 +\frac{GMr_1}{e}- \frac{l^2}{2e} = 0 \]

So the solutions are given by \[ \begin{split} r_1 &= \frac{1}{2}\left( -\frac{GM}{e} \pm \sqrt{\frac{G^2M^2}{e^2} +2\frac{l^2}{e} }\right)\\ &= -\frac{GM}{2e} \left(1 \pm \sqrt{1+\frac{2l^2e}{G^2M^2}} \right) \end{split} \]

There are, of course, two roots to this quadratic: they are \(r_1\) and \(r_2\), the aphelion and perihelion values. (We could have gone through just the same process expressing \(l\) and \(e\) in terms of \(r_2\), and obtained the same quadratic.)

We should pay attention to signs here: we are taking a factor of \(|GM/e|\) out of the square root, but since \(e<0\), as mentioned above, this is \(-GM/e\).

Sanity check: for a circular orbit, the radius \(r\) is constant, and from Newton's law we obtain \(v^2/r = GM/r^2\). From this we find \[ e=-\frac{GM}{2r} \] and \[ l^2=v^2r^2 = GMr \] so that \[ \frac{2l^2e}{G^2M^2}=-1 \] and \(r_1=r_2\), i.e. the orbit is indeed circular.

Since \(r<0\), the square root term is smaller than \(1\), and we want the perihelion value, so we have \[ r_1 = -\frac{GM}{2e} \left(1 - \sqrt{1+\frac{2l^2e}{G^2M^2}} \right) \]

Now, let's see what the situation is for starting off with a body in the same orbit as the earth, and trying to adjust it so that it hits the sun. The earth's orbit is very nearly circular, so we'll just assume it is exactly circular, with constant radius and orbital speed.

By the magic of Google (or looking in a book), we find that the mass of the sun is\(M=1.99\times 10^{30}\text{kg}\), the radius of the sun is \(6.96 \times 10^{8}\text{m}\), the orbital radius of the earth is \(1.5 \times 10^{11} \text{m}\) and the orbital speed is \(2.98 \times 10^{4}\text{ms}^{-1}\).

In the diagram below, if the earth's orbit is the black one, we want to divert it to the red orbit by reducing the orbital speed from \(v\) to a smaller value \(u\), so that perihelion would occur inside the sun.

Now, we can set \(r_1 = 6.96 \times 10^{8}\text{m}\), and, noting that \(e\) and \(l\) are functions of the orbital speed, solve numerically for the orbital speeds which give a perihelion distance of less than \(6.96 \times 10^{8}\text{m}\). I asked a computer, and it told me that \(u\) must lie in \((-2858,2858)\), so we have to slow the orbit down by a change of speed \(\Delta v\) of at least \(29800-2858\approx 27000 \text{ms}^{-1}\).

For a kilogram of mass, that means we need to use \((\Delta v)^2/2 \approx 360\) MegaJoules to change its speed by enough. This is quite a lot of energy.

There's also the problem of getting the mass out of the earth's gravitational field: this takes about another \(62\) megajoules.

So in total, we need to use about \(420\) megajoules of energy to drop each kilogram of stuff into the sun.

For comparison, it requires about \(15\) megajoules to transport a kilogram of stuff to the ISS, in low earth orbit\(1\). An example from closer to home is that a megajoule is the kinetic energy of a vehicle weighing one tonne, travelling at about \(161\text{km}\text{h}^{-1}\).

So there we have it: we won't be just dropping our waste into the sun any time soon because apart from anything else, it's just so much work.


  1. You might wonder if it's possible to do better by pushing the object at an angle to its original orbit, rather than just by slowing it down. Unfortunately, the option discussed above is the one requiring the least energy.
  2. I've only considered the situation of giving the object an orbit which falls into the sun. Could one do better by arranging one which flies close past another planet and takes advantage of a slingshot orbit? Yes, and that was used to help send probes to Mercury. This can save energy, but it is a delicate manoeuvre; I'll leave you to try to work out if it might make this form of waste disposal more feasible.


Thanks to Colins @icecolbeveridge and @ColinTheMathmo for comments and feedback.