'Everybody knows' that the uncertainty principle of quantum mechanics tells us that we can't make a measurement on a system without disturbing it, and that that's why you can't simultaneously measure the position and momentum of a particle. This isn't an entirely unreasonable thing to think: it is, after all, closely related to Heisenberg's original argument involving using light to probe the properties of a particle. But it's wrong. The situation in quantum mechanics is subtler and much more interesting than that.

Statistics rears its ugly head

Suppose we have a large collection of boxes, each containing a particle, and they have been prepared so that the particle in every box is in exactly the same state. (Or if you prefer, one box containing a particle, prepared in the same way a large number of times.)
You may be wondering how we do that. I'm not going to tell you, largely because I have no idea. But I can tell you what quantum mechanics predicts about the measurements we make on the particles inside. You may think this is unfair, but it's no less fair than using classical mechanics to tell you what happens to a ball thrown at 20 meters per second at an upward angle of 30 degrees from a cliff edge 100 meters above sea level: classical mechanics doesn't tell us how to arrange that situation either, only what the consequences are.
Now we measure the \(x\)-coordinate of the position of each particle. It could be that every single box is guaranteed to give the same value for this measurement. In this case we say that the particle is in an eigenstate of \(x\)-position, and the eigenvalue is the \(x\) value that we measure.
But this is not a very likely occurrence. It is much more likely that the measurements result in different values. Then we can compute the average value of all the measurements, and if we have the enthusiasm and stamina, we can compute the standard deviation. Doing this lots of times with differently prepared particles, we find that sometimes the standard deviation is large, and sometimes it is small: so sometimes the particle has a relatively well-defined position (in the sense that the measurements tend to be close together), and sometimes not.
Instead of measuring the \(x\)-coordinate of the position of each particle, we might measure the \(x\)-component of its momentum. A similar situation results. Depending on the initial preparation, we sometimes find a large spread of results, and sometimes a small spread.
There is an obvious question we might ask: what is the relationship between the measurements of position and momentum? The answer is simple to describe, but not quite so easy to explain.
It turns out that we can prepare the particles so that they have a very small standard deviation in their position measurements. We can also prepare them so that they have a very small standard deviation in their momentum measurements. But we can't prepare them in a way where both sets of measurements have a small standard deviation. In fact, the product of the two standard deviations cannot be reduced below a certain value. The more tightly clustered the position measurements are, the more spread out are the momentum measurements; conversely, if the momentum measurements are tightly clustered, the position ones are more spread out.
This is not easy to understand from the point of view that the act of measurement disturbs the system. All the measurements are being carried out on particles in the same state, but some measure position and others measure momentum. So what is going on here?

What is a particle's state?

This is one way in which quantum mechanics is really different from classical mechanics.
In classical mechanics, the state of a particle is described completely by its position \(\pmb{x}\) and momentum \(\pmb{p}\). Given these quantities and the knowledge of the forces acting on the particle, the subsequent values of position and momentum are entirely determined.
The quantum mechanics picture is rather different.

Observables

In quantum mechanics, the state of a system is described by a vector, often denoted \(\psi\). So far, this does not sound so different from the classical mechanics situation: we can think of the position and momentum as a vector too. But it's a different kind of vector.
In the quantum mechanics picture of the universe, each quantity that we can observe is associated with a linear operator (in fact, a special kind of linear operator, called Hermitian, but I won't go into that level of detail here). I'll use \(H\) to represent a (Hermitian) linear operator. What this means is that \(H\) is a machine that I can feed a vector and it returns a vector in a well-behaved way: if \(\alpha, \beta\) are any two complex numbers, and \(\psi,\phi\) are any two vectors, then \[ H(\alpha \psi + \beta \phi) = \alpha H(\psi) + \beta H(\phi) \]
Although I slid past it fairly quickly just then, you may have noticed that I said \(\alpha\) and \(\beta\) could be complex numbers: this is one of the ways that quantum mechanics is different from classical mechanics---the mathematics of complex numbers is there at the fundamental level of the kind of vectors that can be used to describe the state of a system.
Now, if we choose our state vector very carefully, we can have the situation \[ H(\psi) = \lambda \psi \] where \(\lambda\) is a real number. When this happens, \(\psi\) is called an eigenvector of \(H\), and \(\lambda\) is the associated eigenvalue. (For general linear operators the eigenvalues can be complex, but for Hermitian operators they must be real.)
Eigenvectors are very important in this story. I'll explain why shortly, but first, we need to know this: if \(H\) is the linear operator corresponding to an observable, then there are eigenvectors \(\psi_i\) of \(H\) such that any state vector \(\psi\) can be expressed in the form \[ \psi = \sum_{i} \alpha_i \psi_i \] in just one way, where \(\sum_i |\alpha_i|^2 = 1\). (We call this set \(\{\psi_i\}\) the basis of eigenvectors associated with \(H\).)
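To make this concrete, here is a minimal sketch in Python with NumPy, using a small Hermitian matrix as a stand-in for an observable. The matrix `H`, the state `psi` and the choice of two dimensions are all illustrative assumptions; the operators for position and momentum act on infinite-dimensional spaces and are not captured by a toy like this.

```python
import numpy as np

# A hypothetical 2x2 Hermitian matrix standing in for an observable H.
H = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])
assert np.allclose(H, H.conj().T)  # Hermitian: equal to its conjugate transpose

# eigh is NumPy's eigensolver for Hermitian matrices. It returns real
# eigenvalues and orthonormal eigenvectors (the columns of `basis`).
eigenvalues, basis = np.linalg.eigh(H)

# An arbitrary normalised state vector psi (an illustrative choice).
psi = np.array([1.0, 1.0j]) / np.sqrt(2)

# Coefficients alpha_i of psi in the eigenvector basis.
alpha = basis.conj().T @ psi

# psi is recovered as sum_i alpha_i psi_i, and the |alpha_i|^2 sum to 1.
reconstructed = basis @ alpha
total_weight = np.sum(np.abs(alpha) ** 2)

print("eigenvalues:", eigenvalues)
print("sum of |alpha_i|^2:", total_weight)
```

The eigenvalues come out real even though `H` has complex entries, which is the property of Hermitian operators mentioned above.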

Observations

'Just what,' you should be wondering by now, 'does this have to do with measuring a property of a particle?'
Well, given an observable, \(H\), (note that I'm abusing terminology a bit here by referring to the linear operator as the observable: I do that) and its associated basis, \(\{\psi_i\}\), we know that each \[ H (\psi_i) = \lambda_i \psi_i \] for some real number \(\lambda_i\).
Then if we have a particle in the state \[ \psi = \sum_{i} \alpha_i \psi_i \] we can carry out a measurement of the observable corresponding to \(H\).
The result of the measurement must be one of the values \(\lambda_i\), and a result of \(\lambda_i\) occurs with probability \(|\alpha_i|^2\). (So we don't know just what the result will be, just the probabilities of the various possible results.) Furthermore, immediately after the measurement, if the measured value is \(\lambda_K\), then the state of the system is \(\psi_K\). (This is called the collapse of the wave function.)
Quantum mechanics does not tell us which outcome will occur; it only tells us the probabilities of the various outcomes.
So for example, if the state of the system were \(\psi = 0.6 \psi_1 + 0.8 \psi_2\), then the result of measuring \(H\) would be \(\lambda_1\) with probability \(0.36\) (and immediately after these measurements the system would be in state \(\psi_1\)), and it would be \(\lambda_2\) with the remaining probability \(0.64\) (and immediately after those measurements the system would be in state \(\psi_2\)).
On the other hand, if the state of the system were \(\psi=\psi_1\), then the result of measuring \(H\) would always be \(\lambda_1\), and the state of the system would be unchanged by the measurement. (This, of course, contradicts the claim that it is impossible to make a measurement without disturbing the system.)
So we can see that (given this model of particle states and measurements), the average and spread of measurements are determined by the coefficients \(\alpha_i\); if most of the weight is associated with a particular basis vector, then the results have a very small spread, and if many basis vectors make a significant contribution, then the results are more spread out.
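The boxes-of-particles experiment can be simulated directly for the example state \(\psi = 0.6 \psi_1 + 0.8 \psi_2\). This is only a sketch: the eigenvalues 1 and 4, the number of boxes and the random seed are made-up values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

# Hypothetical eigenvalues lambda_1, lambda_2 of some observable H.
eigenvalues = np.array([1.0, 4.0])
# Coefficients of the state psi = 0.6 psi_1 + 0.8 psi_2 from the text.
alpha = np.array([0.6, 0.8])

# Outcome lambda_i occurs with probability |alpha_i|^2.
probabilities = np.abs(alpha) ** 2

# Measure H once in each of many identically prepared boxes.
n_boxes = 100_000
results = rng.choice(eigenvalues, size=n_boxes, p=probabilities)

sample_mean = results.mean()
sample_std = results.std()

# The same quantities computed directly from the coefficients.
exact_mean = np.sum(probabilities * eigenvalues)
exact_std = np.sqrt(np.sum(probabilities * eigenvalues**2) - exact_mean**2)

print(f"sampled: mean {sample_mean:.3f}, std {sample_std:.3f}")
print(f"exact:   mean {exact_mean:.3f}, std {exact_std:.3f}")
```

Changing `alpha` to put nearly all the weight on one basis vector, say `[0.999, 0.0447]`, shrinks the spread towards zero, as described above.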

Uncertainty

So now we can try to see what this might suggest about measurements of position and momentum. Let's call the operator corresponding to measurement of \(x\)-position \(X\), with corresponding basis \(\{\psi_i\}\) and the operator for the momentum measurement, \(P\), with corresponding basis \(\{\phi_i\}\).
The crux of the situation is that the \(\psi_i\) cannot be eigenvectors of \(P\), and the \(\phi_i\) cannot be eigenvectors of \(X\).
So if a particle is prepared in a state which guarantees a particular value of \(x\), so \(\psi = \psi_i\), then it must be a combination of several of the \(\phi_j\); likewise, if \(\psi=\phi_i\), it must be a combination of several of the \(\psi_j\). So certainty of position means that momentum is uncertain, and vice versa.

What's going on here?

The problem is that the operators \(P\) and \(X\) do not commute: if \(\psi\) is a state vector, then \(PX(\psi) \neq XP(\psi)\). The mathematics of Hermitian operators tells us that it is possible to find a basis of vectors which are eigenvectors for both of a pair of operators \(H_1\) and \(H_2\) if and only if the two commute, i.e. \[ H_1H_2 (\psi) = H_2H_1(\psi) \] for all vectors \(\psi\).
We can define the operator \([H_1,H_2]=H_1H_2 - H_2H_1\), called the commutator of \(H_1\) and \(H_2\), and then we can equivalently say that \(H_1\) and \(H_2\) commute if \[ [H_1,H_2] = 0 \] (meaning that \([H_1,H_2](\psi) = 0\) for all \(\psi\)).
It then follows that since \(P\) and \(X\) do not commute, one cannot have simultaneously sharply defined values for both the \(x\)-position and the \(x\)-component of momentum for a particle.
On the other hand, if you find two observables which do commute, then it is possible to have sharply defined values for both; indeed, there is a whole basis of states in which both are sharply defined at once.
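The link between commuting and sharing eigenvectors can be checked on small matrices. The sketch below uses the Pauli matrices \(\sigma_x\) and \(\sigma_z\) (spin observables for a spin-1/2 particle) as a finite-dimensional analogy, together with a made-up diagonal observable `d`; it is not a model of \(X\) and \(P\) themselves, which cannot be represented by finite matrices. The helper names `commutator` and `is_eigenvector` are my own.

```python
import numpy as np

def commutator(a, b):
    """Return the commutator [a, b] = ab - ba of two square matrices."""
    return a @ b - b @ a

def is_eigenvector(op, v):
    """True if the normalised vector v is an eigenvector of op."""
    w = op @ v
    # For an eigenvector, op(v) equals <v, op(v)> times v.
    return bool(np.allclose(w, np.vdot(v, w) * v))

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
d = np.diag([5.0, 7.0]).astype(complex)  # hypothetical diagonal observable

c_xz = commutator(sigma_x, sigma_z)  # not the zero matrix
c_zd = commutator(sigma_z, d)        # the zero matrix

up = np.array([1, 0], dtype=complex)  # an eigenvector of sigma_z

# sigma_z and d commute, and `up` is an eigenvector of both.
shared_with_d = is_eigenvector(sigma_z, up) and is_eigenvector(d, up)
# sigma_z and sigma_x do not commute, and `up` is not an eigenvector of sigma_x.
shared_with_x = is_eigenvector(sigma_x, up)

print("[sigma_x, sigma_z] is zero:", np.allclose(c_xz, 0))
print("[sigma_z, d] is zero:", np.allclose(c_zd, 0))
print("up is a shared eigenvector with d:", shared_with_d)
print("up is an eigenvector of sigma_x:", shared_with_x)
```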

Heisenberg's uncertainty relation

Heisenberg's uncertainty relation (as we now understand it) tells us how the extent to which two operators fail to commute limits the extent to which values of the corresponding observables can be simultaneously well-defined. More precisely, given a (normalised) state \(\psi\) and observables \(A\) and \(B\), the product of the standard deviations of measurements of \(A\) and \(B\) must be at least half the magnitude of the inner product of \(\psi\) with the vector \([A,B](\psi)\): \[ \sigma_A \sigma_B \geq \frac{1}{2} \left| \langle \psi, [A,B](\psi) \rangle \right| \]
In the case of \(x\)-coordinate of position and \(x\)-component of momentum, \([X,P](\psi) = i\frac{h}{2\pi}\psi\) for every state \(\psi\), so this magnitude is always \(h/2\pi\), giving the well-known uncertainty relation \[ \sigma_x \sigma_p \geq \frac{h}{4\pi} \]
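The position-momentum bound of \(h/4\pi\) can be checked numerically. The sketch below is a rough illustration under several assumptions of mine: units in which \(h/2\pi = 1\), a Gaussian wave packet (a standard example of a state that attains the minimum), a finite grid, and a discrete Fourier transform to obtain the momentum distribution. The function name `spreads` and the grid parameters are illustrative.

```python
import numpy as np

HBAR = 1.0  # units in which h / (2 pi) = 1 (a convenience assumption)

def spreads(sigma, n=4096, length=200.0):
    """Standard deviations of position and momentum for a Gaussian packet."""
    x = np.linspace(-length / 2, length / 2, n, endpoint=False)
    dx = x[1] - x[0]
    psi = np.exp(-x**2 / (4 * sigma**2))  # unnormalised wave function

    # Position distribution |psi(x)|^2, normalised on the grid.
    prob_x = np.abs(psi) ** 2
    prob_x /= prob_x.sum()
    sigma_x = np.sqrt(np.sum(prob_x * x**2) - np.sum(prob_x * x) ** 2)

    # Momentum distribution from the Fourier transform of psi.
    p = 2 * np.pi * HBAR * np.fft.fftfreq(n, d=dx)
    prob_p = np.abs(np.fft.fft(psi)) ** 2
    prob_p /= prob_p.sum()
    sigma_p = np.sqrt(np.sum(prob_p * p**2) - np.sum(prob_p * p) ** 2)
    return sigma_x, sigma_p

for sigma in (0.5, 1.0, 4.0):
    sx, sp = spreads(sigma)
    print(f"sigma_x = {sx:.3f}, sigma_p = {sp:.3f}, product = {sx * sp:.3f}")
```

Squeezing the packet in position (smaller `sigma`) widens it in momentum by the same factor, so the product stays at about `HBAR / 2`, the minimum allowed value.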

Just a minute \(\ldots\)

We've actually shifted ground in a fairly radical way here. We've gone from 'you can't measure a particle's position and momentum simultaneously', which comes from a world-view in which it has a position and momentum but we can't access the values, to 'a particle's state does not simultaneously specify a sharp position and momentum', which comes from a world-view where the state of a particle just isn't the same information as it is in classical physics.
This, together with the fact that the theory is probabilistic rather than deterministic, has been a significant issue in the development of quantum mechanics. The desire to maintain a 'realist', deterministic picture of physics, in which particles do have well-defined position and momentum even if we can't know them, survives, and even now has its proponents.
And all this doesn't actually explain the uncertainty principle. It just provides a mathematical/theoretical framework which it is a consequence of. But that's OK, because that's generally how physics proceeds: we have phenomena that are hard to understand, and we develop a structure in which those phenomena are natural. Once we've become thoroughly accustomed to that structure, we tell ourselves we understand what's going on. At least, until the next Big Idea comes along and we start all over again.

Shiny Pebbles and other stuff

Friday, 29 December 2017

Measuring up to Heisenberg (and since)