Sunday, March 30, 2014

A very dangerous factory

Suppose a new type of bomb is invented whose detonation device is so incredibly sensitive that if it comes into contact with a single particle it will explode. Putting aside the impracticality of such a weapon (and the obvious factory OH&S issues), the producer wishes to maintain quality control because, as with anything, some bombs will be faulty and not have detonation devices attached. The question immediately arises: Is it possible to have some ensemble of bombs which we can guarantee contains no faulty weapons?

This question is known as the Elitzur-Vaidman bomb-testing problem, and although one can arrive after reasonably little thought at the fairly obvious answer that no such ensemble is possible (as any direct observation using light or matter will detonate any working bombs), in actual fact such an ensemble is possible! How can this be the case? The short answer is: quantum effects. The long answer? Read on!

Figure 1: A Mach-Zehnder interferometer with faulty bomb $B$ in place and all branches labelled. Note that the $d$-branch is drawn only for illustrative purposes; the photon cannot be detected along the $d$-branch due to destructive interference (see equation \ref{eq:MZ}). (1) The single-photon source $S$. (2) One of the two 50:50 beam-splitters, which are both assumed to be lossless. (3) One of the two mirrors, which are assumed to be perfectly reflective. (4) One of the two detectors, which are assumed to be perfect detectors.

The solution to this problem involves the use of a Mach-Zehnder interferometer (Fig. 1) with a single-photon source. To see how, let's consider the case of the interferometer without any bomb in place. We then have
\begin{align}\label{eq:MZ}
\left|s\right\rangle &\rightarrow \frac{i}{\sqrt{2}}\left|u\right\rangle + \frac{1}{\sqrt{2}}\left|v\right\rangle \nonumber \\
&\rightarrow \frac{i}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}\left|c\right\rangle + \frac{i}{\sqrt{2}}\left|d\right\rangle\right) + \frac{1}{\sqrt{2}}\left(\frac{i}{\sqrt{2}}\left|c\right\rangle + \frac{1}{\sqrt{2}}\left|d\right\rangle\right) \nonumber \\
&= \frac{i}{2}\left|c\right\rangle + \frac{-1}{2}\left|d\right\rangle + \frac{i}{2}\left|c\right\rangle + \frac{1}{2}\left|d\right\rangle \nonumber \\
&= i\left|c\right\rangle,
\end{align}
where $\left|a\right\rangle$ represents the quantum state in the $a$-branch of the interferometer (as labelled in Fig. 1) and $i$ is the imaginary unit.$^{1}$ What the above calculation shows$^{2}$ is that (somewhat surprisingly) despite the branching at the second beam-splitter, destructive interference along $d$ and constructive interference along $c$ cause the photon to always be detected at $C$ and never at $D$ (for this alignment).
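The calculation in equation \ref{eq:MZ} can also be checked numerically. What follows is a minimal sketch in Python (using NumPy), assuming the 50:50 beam-splitter matrix derived in note 1 and ignoring the mirrors, which affect both arms equally. The function name and the ordering of the ports are bookkeeping choices made here for illustration only.

```python
import numpy as np

# 50:50 beam-splitter: transmission amplitude 1/sqrt(2), reflection amplitude i/sqrt(2).
B = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

def mach_zehnder_no_bomb():
    """Return the amplitudes (c, d) at the two outputs when nothing blocks either arm."""
    photon_in = np.array([1, 0], dtype=complex)  # (s, unused input port)
    v, u = B @ photon_in                         # s is transmitted into v, reflected into u
    d, c = B @ np.array([v, u])                  # second beam-splitter recombines the arms
    return c, d

c, d = mach_zehnder_no_bomb()
print("P(C) =", abs(c) ** 2)
print("P(D) =", abs(d) ** 2)
```

The amplitude at $d$ cancels exactly, so all of the probability ends up at $C$, in agreement with the last line of equation \ref{eq:MZ}.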

Figure 2: A Mach-Zehnder interferometer with working bomb $B$ in place and all branches labelled. Note that $B$ blocks the $u$-branch whether the photon interacts with the detector or not (the case of an interaction is not illustrated here as this would correspond to the detonation of the bomb).

Now let's consider the same Mach-Zehnder interferometer but with a bomb placed such that its detonator lies along the $u$-branch (as shown in Fig. 2). In this case we have
\begin{align}\label{eq:bomb}
\left|s\right\rangle\left|B_0\right\rangle &\rightarrow \frac{i}{\sqrt{2}}\left|u\right\rangle\left|B_0\right\rangle + \frac{1}{\sqrt{2}}\left|v\right\rangle\left|B_0\right\rangle \nonumber \\
&\rightarrow \frac{i}{\sqrt{2}}\left|X\right\rangle + \frac{1}{\sqrt{2}}\left|v\right\rangle\left|B_0\right\rangle \nonumber \\
&\rightarrow \frac{i}{\sqrt{2}}\left|X\right\rangle + \frac{1}{\sqrt{2}}\left(\frac{i}{\sqrt{2}}\left|c\right\rangle + \frac{1}{\sqrt{2}}\left|d\right\rangle\right)\left|B_0\right\rangle \nonumber \\
&= \frac{i}{\sqrt{2}}\left|X\right\rangle +\frac{i}{2}\left|c\right\rangle\left|B_0\right\rangle + \frac{1}{2}\left|d\right\rangle\left|B_0\right\rangle,
\end{align}
where $\left|B_0\right\rangle$ is the 'primed' or unexploded bomb, $\left|X\right\rangle$ represents the state where the bomb has been detonated$^3$ and $\left|a\right\rangle\left|b\right\rangle\equiv\left|a\right\rangle\otimes\left|b\right\rangle$. Note that for the purposes of this thought experiment we are assuming the detonator is a perfect detector, i.e., the photon wave cannot travel down $u$ without being absorbed.

As is clear from equation \ref{eq:bomb}, the inclusion of the detonator destroys the constructive/destructive interference that caused the simplification in equation \ref{eq:MZ}. Therefore, in the detonator case, rather than having every photon detected at $C$, we have the photon detected at $C$ with a probability of $1/4$, detected at $D$ with a probability of $1/4$ and the bomb detonated with a probability of $1/2$.$^4$
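The three probabilities quoted above can be reproduced with a short numerical sketch. It assumes, as the main text does, that the detonator is a perfect absorber sitting in the $u$-branch, so that only the $v$-branch amplitude reaches the second beam-splitter; the function name is hypothetical.

```python
import numpy as np

# Same 50:50 beam-splitter matrix as in note 1.
B = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

def mach_zehnder_with_bomb():
    """Return (P(explode), P(C), P(D)) for a working bomb blocking the u-branch."""
    v, u = B @ np.array([1, 0], dtype=complex)
    p_explode = abs(u) ** 2                      # the detonator absorbs everything in the u-branch
    d, c = B @ np.array([v, 0], dtype=complex)   # only the v-branch reaches the second beam-splitter
    return p_explode, abs(c) ** 2, abs(d) ** 2

p_explode, p_c, p_d = mach_zehnder_with_bomb()
print("P(explode) =", p_explode)
print("P(C)       =", p_c)
print("P(D)       =", p_d)
```

With the $u$-branch removed there is nothing left to interfere with at the second beam-splitter, which is why $D$ now fires a quarter of the time.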

This is what makes it possible to assemble a set of functional bombs without detonating them—if a photon is detected by $D$ then the bomb must have a detonator attached and so we can set it aside knowing it works. If a photon is detected by $C$ then the functionality is indeterminate as we expect a detection at $C$ with non-zero probability in both detonator and no-detonator cases, but this is not a problem as we can simply emit another photon and re-run the test.
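It is natural to ask what fraction of working bombs this re-testing procedure eventually certifies. The sketch below simply adds up the probabilities from the previous paragraph using exact fractions: each photon certifies the bomb with probability $1/4$, detonates it with probability $1/2$, and is inconclusive with probability $1/4$. The function name and the cap on the number of photons are illustrative assumptions.

```python
from fractions import Fraction

P_EXPLODE = Fraction(1, 2)  # photon absorbed by the detonator
P_C = Fraction(1, 4)        # inconclusive click at C, so we try again
P_D = Fraction(1, 4)        # click at D, bomb certified as working

def certified_fraction(max_photons):
    """Probability that a working bomb gives a click at D within max_photons attempts."""
    total = Fraction(0)
    still_untested = Fraction(1)  # probability that every photon so far was detected at C
    for _ in range(max_photons):
        total += still_untested * P_D
        still_untested *= P_C
    return total

for k in (1, 2, 5, 50):
    print(k, float(certified_fraction(k)))
```

The sum is a geometric series, $\frac{1}{4}\left(1+\frac{1}{4}+\frac{1}{16}+...\right)=\frac{1}{3}$, so with this simple apparatus about a third of the working bombs can be certified and the remaining two thirds are lost.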

Note that while the probabilities above can be derived (in a fairly straightforward manner) from classical principles, we cannot apply a classical interpretation here as the quantum nature of the experiment is indispensable. In the classical (many-photon) run it is possible to both detonate a bomb and make a detection at $D$; this is precluded in the quantum case as the single photon cannot be absorbed by multiple objects. Furthermore, it is the wave nature of the photon that permits the destructive interference at $D$ in the no-detonator case; this is what allows a detection at $D$ to signify the presence of the detonator and so constitute a successful 'interaction-free' measurement.

If you're unconvinced of this argument because it is based on a purely theoretical consideration, consider that this thought experiment has (equivalently) been carried out in the real world (admittedly using an ordinary detector rather than a bomb) and in fact was first done about a year after this problem was first published. I can't speak to the practical applications, if any exist, but I love this problem regardless for the simple fact that the solution challenges your intuition but can be understood using reasonably straightforward quantum mechanical principles.

Notes

$1$. The inclusion of $i$ in these equations might seem unusual or arbitrary, so I will provide a derivation here that shows where it comes from.

Figure 3: A beam-splitter with two incoming beams ($\psi_1$ and $\psi_2$) and two outgoing beams ($\psi_3$ and $\psi_4$). The incoming and outgoing beams are related by the beam-splitter matrix for the beam-splitter in question, as shown in equation \ref{eq:BSM}. In the note below, the beam-splitter will be assumed to be 50:50 in accordance with the calculations in the main text.

Consider a beam-splitter as shown in Fig. 3. This system can be represented by the matrix equation $\left|\psi_3,\psi_4\right\rangle = \hat{B}\left|\psi_1,\psi_2\right\rangle$, or explicitly,
\begin{equation}\label{eq:BSM}
\begin{pmatrix}
\psi_3 \\ \psi_4
\end{pmatrix}
=
\begin{pmatrix}
T & R \\ R & T
\end{pmatrix}
\begin{pmatrix}
\psi_1 \\ \psi_2
\end{pmatrix},
\end{equation}
where $T$ and $R$ are the transmission and reflection coefficients respectively. In the experiment we assume an ideal, lossless beam-splitter which demands that the beam-splitter matrix be unitary, i.e., $\hat{B}^{\dagger}\hat{B}=\hat{\mathbb{I}}$, or,
\begin{equation}\label{eq:unitary}
\begin{pmatrix}
T^{\ast} & R^{\ast} \\ R^{\ast} & T^{\ast}
\end{pmatrix}
\begin{pmatrix}
T & R \\ R & T
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 \\ 0 & 1
\end{pmatrix}.
\end{equation}
Equation \ref{eq:unitary} immediately implies the following relations:
\begin{equation}
|T|^2+|R|^2=1,
\end{equation}
\begin{equation}\label{eq:0}
T^{\ast}R+R^{\ast}T=0.
\end{equation}
As $T$ and $R$ are complex numbers, we can represent them in polar form as $T=|T|e^{i\theta_T}$ and $R=|R|e^{i\theta_R}$. For simplicity we choose $\theta_T=0$ and thus $T=|T|\implies T^{\ast}=T$ and so equation \ref{eq:0} becomes
\begin{align}\label{eq:0new}
T|R|e^{i\theta_R}+|R|e^{-i\theta_R}T&=0 \nonumber \\
2T|R|\cos{\left(\theta_R\right)}&=0
\end{align}
where we have made use of the identity $\cos{(\alpha)}=e^{i\alpha}/2+e^{-i\alpha}/2$. Equation \ref{eq:0new} is satisfied by $\theta_R=n\pi+\pi/2, n\in\mathbb{Z}$, but we will choose $n=0\implies\theta_R=\pi/2$ for simplicity, which in turn gives $R=|R|e^{i\pi/2}=i|R|$.
Finally, as the beam-splitter is 50:50 (50% transmission, 50% reflection) we demand $|T|=|R|=1/\sqrt{2}$ and so the beam-splitter matrix is given by
\begin{equation}\label{eq:B}
\hat{B}=\frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & i \\ i & 1
\end{pmatrix}.
\end{equation}
It should be clear that equation \ref{eq:B} is not a unique representation of $\hat{B}$; another choice of $\theta_T$ and/or $\theta_R$ would yield a different (unitary) matrix that would make no difference to the calculations shown in equations \ref{eq:MZ} and \ref{eq:bomb} (I leave proof of this as an exercise for the interested reader). With that said, the reason I like this representation is that it allows $i$ to function as a label for the states that result from a beam-splitter reflection, making it easier to write down interferometer equations directly from the diagram and keep track of where each term comes from. This is, of course, purely a matter of personal preference.
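For readers who would rather check than prove, here is a small numerical sketch of the two claims in this note: that the matrix is unitary when $\theta_R=\pi/2$, and that another allowed choice such as $\theta_R=-\pi/2$ leaves the detection probabilities of equation \ref{eq:MZ} unchanged. It keeps $\theta_T=0$ throughout, and the function names are hypothetical.

```python
import numpy as np

def beam_splitter(theta_R):
    """50:50 beam-splitter matrix with theta_T = 0 and reflection phase theta_R."""
    T = 1 / np.sqrt(2)
    R = np.exp(1j * theta_R) / np.sqrt(2)
    return np.array([[T, R],
                     [R, T]])

def is_unitary(M):
    """Check the lossless condition B^dagger B = I."""
    return np.allclose(M.conj().T @ M, np.eye(2))

def output_probabilities(M):
    """Return (P(C), P(D)) for the empty interferometer built from two copies of M."""
    v, u = M @ np.array([1, 0], dtype=complex)
    d, c = M @ np.array([v, u])
    return abs(c) ** 2, abs(d) ** 2

for theta in (np.pi / 2, -np.pi / 2, 0.0):
    print(theta, is_unitary(beam_splitter(theta)))
```

Only the first two phases give a unitary matrix; the choice $\theta_R=0$ fails because it does not satisfy equation \ref{eq:0new}.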

$2$. This equation is an example of quantum superposition in action. For example, the first line says that the photon exists in a superposition of the $\left|u\right\rangle$ and $\left|v\right\rangle$ states where the states are equally weighted (as we are assuming normalisation). Superposition is a fundamental aspect of quantum mechanics that follows from the linearity of the Schrödinger equation (linear combinations of solutions will themselves be solutions). In this case, the beam-splitter splits the photon probability wave along the two channels and so in some sense the photon travels along both branches, although no measurement can be made which will detect the photon in both channels at once—this is not a consequence of experimental limitations but is a restriction that is fundamental to quantum theory. The question of why this is the case is a deep and ongoing one, and I encourage the interested reader to investigate the literature on the philosophy (and especially interpretations) of quantum mechanics.

$3$. I have gone to some pains in this post to avoid using the term "wavefunction collapse" at any point, although for clarity I will say that in the Copenhagen interpretation, the case of the photon interacting with the detonator (or any of the detectors for that matter) is an example of wavefunction collapse.

$4$. So long as the beam-splitters are both 50:50, as we have assumed throughout this blog post. Naturally, other types of beam-splitters will yield different results, and in fact using a more sophisticated apparatus will permit a much better detection level (in theory, the detection fraction can be brought arbitrarily close to 1, although I cannot speak to the practicality of such an apparatus).

Thursday, March 20, 2014

News (2014/03/20)

I'm trying to post on my blog much more often this year than I used to, in fact as close to every week as I can manage. Unfortunately, the post I'm working on at the moment isn't nearly ready for publication, so this week I'm instead going to make a little news post, the first item of which will be the thing that I just told you (about the new blog post coming soon)!

The next item of news is not particularly new; last Friday (the 14th of March) was Pi Day. Pi Day is of course silly for a whole bunch of reasons (the main one being it's based on the completely nonsensical American dating system) so I've decided to introduce readers who may not be familiar with it to the concept of tau (τ). Tau has been proposed as an alternative to pi, and while I am not especially partisan on the matter I have to say I am somewhat sympathetic. Here is the case for tau laid out in the Tau Manifesto and for the sake of fairness a counterargument in the Pi Manifesto.

Finally, the real news comes in the form of the results of the recent BICEP2 measurements of B-mode polarisation in the CMB, easily the biggest news in physics since the Higgs was announced in 2012 and a major breakthrough for early-universe cosmologists. I will be able to do an explanatory post about the news if there's enough demand for one, but otherwise a lot of good explanations can be found around the place ranging from the somewhat simplistic to the slightly more technical. This is a very exciting time for fundamental physics and I expect to see some very interesting papers published in the next couple of years based on insights from this new data.

Finally, it isn't technically news, but if you haven't heard of them already, I strongly urge you to check out Brady Haran's science channels, especially Sixty Symbols (physics) and Numberphile (mathematics); I've subscribed to most of them on YouTube and they are absolutely fantastic.

That's all for this quick post; hopefully I'll have a considerably more in-depth number ready for next week! See you then!

Thursday, March 6, 2014

Playing with infinite series

Sequences, series and summation notation

In mathematical parlance, the term 'sequence' carries much the same meaning as in regular speech; it refers to an ordered list of numbers that usually follows a rule. For example, the sequence $1, 2, 3, 4, 5...$ (onwards without end) is given by adding 1 to the previous number in the sequence, beginning at 1. The definition of the term 'series' is less obvious. One description is that a series is the sum of a sequence, or in other words, to obtain a series one adds all the terms of a sequence. For a finite series (a series with finitely many terms, or alternatively a series with a last term), the mathematics is typically fairly straightforward and so we will focus instead on the topic of infinite series, but before we do so, I will make a brief digression to discuss notation.

For reasons I am sympathetic to, few people enjoy reading about mathematical notation and it is difficult to write about it in a way which is interesting. However, while it is possible to write a series as, for example, $1+2+3+4+5+...$ this soon becomes cumbersome and is very limiting for finite series with a large number of terms and for series for which the rule or pattern is not obvious from the first few terms. For these reasons, mathematicians have developed a notation for series which I will adopt in the remainder of this post for instructional purposes (side–by–side with the long-hand version for clarity).

The notation is referred to as summation notation or sigma notation and uses a large capital sigma $\Sigma$ to denote the summation. The rule is written to the right of the sigma in terms of an index of summation; the lowest value of the index is written under the sigma and the highest value above it (or '$\infty$' for an infinite series with no final term).$^1$

A simple example of sigma notation in action is the finite series
\begin{align} \label{eq:square5}
\sum_{n=1}^{5}n^2 & = 1^2+2^2+3^2+4^2+5^2 \nonumber \\
& = 1+4+9+16+25,
\end{align}
which in this case is equal to $55$. Here, the index of summation $n$ appears prominently in the rule as the number that is being squared. Another example, slightly trickier this time, is the infinite series
\begin{align} \label{eq:Zeno} 
\sum_{n=1}^{\infty}\left(\frac{1}{2}\right)^n
& = \left(\frac{1}{2}\right)^1+\left(\frac{1}{2}\right)^2+\left(\frac{1}{2}\right)^3+\left(\frac{1}{2}\right)^4+... \nonumber \\ & = \frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\frac{1}{16}+...
\end{align}
where the index of summation is this time the power that $1/2$ is raised to in each term of the sum.$^2$
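Sigma notation translates almost word for word into code: the index of summation becomes a loop variable and the rule becomes the expression being summed. As a minimal sketch (the variable and function names are just illustrative), the two examples above look like this in Python, with the infinite series cut off after a chosen number of terms since a computer cannot add forever.

```python
from fractions import Fraction

# Equation eq:square5: the index n runs from 1 to 5 and the rule is n squared.
sum_of_squares = sum(n ** 2 for n in range(1, 6))

def geometric_partial_sum(N):
    """First N terms of equation eq:Zeno, i.e. (1/2)^1 + (1/2)^2 + ... + (1/2)^N."""
    return sum(Fraction(1, 2) ** n for n in range(1, N + 1))

print(sum_of_squares)
for N in (1, 2, 3, 4, 10):
    print(N, geometric_partial_sum(N))
```

Each extra term halves the remaining gap to 1, which is the behaviour discussed in note 2.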

Infinite series behaving badly

An infinite series is in many ways a different beast to a finite series. Possibly the clearest difference is conceptual; finite series are computed fairly easily, in principle at least: it is simply a case of adding so many numbers together and then reading the value off your calculator. This is not possible with an infinite series, as the sum goes on forever and there is no final term after which to hit the '$=$' button. This is not always a problem, however. Some series are known as 'convergent', which is to say that they are in effect equal or equivalent to some number. There are many tests for convergence, but suffice it to say that a series is convergent when its running total gets (and stays) arbitrarily close to said number as more and more terms are added.$^3$ We have already seen a convergent series in equation \ref{eq:Zeno}, but another example of a convergent series is
\begin{equation}
\label{eq:euler}
e=\sum^{\infty}_{n=0}\frac{1}{n!}=1+1+\frac{1}{2}+\frac{1}{6}+...
\end{equation}
where $!$ is the factorial operator.$^4$ As you can see, this series converges to $e$, a truly marvellous number which happens to be my favourite mathematical constant, but an explanation of that fact would require another post entirely.
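To get a feel for how quickly equation \ref{eq:euler} homes in on $e$, one can add up the first few terms and compare against the value of $e$ that ships with Python. This is only a sketch using floating-point arithmetic, and the function name is hypothetical.

```python
import math

def e_partial_sum(N):
    """Sum of 1/n! for n = 0 .. N (the first N + 1 terms of the series for e)."""
    return sum(1 / math.factorial(n) for n in range(N + 1))

for N in (1, 3, 5, 10, 20):
    approx = e_partial_sum(N)
    print(N, approx, abs(approx - math.e))
```

Because the factorial grows so rapidly, a couple of dozen terms already agree with `math.e` to within floating-point precision.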

If a series is not convergent, however, then we call it divergent,$^5$ and that is where the trouble starts. A straightforward example is the harmonic series
\begin{equation}
\label{eq:harmonic}
\sum^{\infty}_{n=1}\frac{1}{n}=1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+...
\end{equation}
which despite its similarity to equation \ref{eq:Zeno} increases indefinitely and does not approach any particular number. This is in some sense easy to understand intuitively; one can see how the sum would just keep getting bigger and bigger. This kind of reasoning is based around partial sums, which are truncations of infinite series. For the harmonic series the partial sum is given by
\begin{equation} \label{eq:Hn}
H_n=\sum^{n}_{k=1}\frac{1}{k}=1+\frac{1}{2}+\frac{1}{3}+...+\frac{1}{n},
\end{equation}
where the $n^{\text{th}}$ partial sum $H_n$ is known as the $n^{\text{th}}$ harmonic number. (Note that here we use $n$ to label the partial sum and so the role of the index of summation is taken over by $k$ to avoid confusion, although we could have chosen $k$ as our label and kept $n$ the index of summation if we wished; it makes no difference). Partial sums are finite series by design and so just like equation \ref{eq:square5} there is a final term after which we can hit the metaphorical '$=$' button. If we do so after 2 terms we find $H_2=3/2$, after 3 terms $H_3=11/6$, 4 terms $H_4=25/12$ and so on.$^6$
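The metaphorical '$=$' button is easy to press in code. The sketch below computes harmonic numbers with exact fractions so that the values quoted above can be compared directly; the function name is an illustrative choice.

```python
from fractions import Fraction

def harmonic_number(n):
    """The n-th partial sum H_n = 1 + 1/2 + ... + 1/n of the harmonic series, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in (2, 3, 4, 10, 100, 1000):
    H = harmonic_number(n)
    print(n, float(H))
```

The printed values keep creeping upwards, albeit ever more slowly, rather than settling down the way the partial sums of equation \ref{eq:Zeno} do.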

Now let us consider an altogether different beast known as Grandi's series. Grandi's series is given as
\begin{equation} \label{eq:Grandi}
\sum^{\infty}_{n=0}(-1)^n=1-1+1-1+1-1...
\end{equation}
Unlike the harmonic series, the partial sums of Grandi's series do not seem to trend in any particular direction, but rather alternate between two 'accumulation points' at 1 and 0. This peculiarity will cause us some trouble, but to see why first we will make a brief foray into the physical sciences.

A brief foray into the physical sciences

Suppose we have two thin neutral conductive plates placed parallel to each other very close together in a vacuum. According to classical physics (and everyday intuition) absolutely nothing will happen. However, this is not what we observe; in fact, the two plates will experience an attractive force attempting to bring them together. This is known as the Casimir effect, and can only be understood in terms of quantum mechanics. The Casimir effect is typically expressed in terms of quantum electrodynamics, but there is nothing inherently electrodynamic about the effect and so one can consider many analogous scenarios with equivalent Casimir effects. In order to greatly simplify our derivation I will choose to do just that.

One thing that it is essential to be aware of is that in quantum mechanics a vacuum is not 'empty' in the sense that there is nothing at all there; so far as we understand, such an emptiness cannot exist in the physical universe. This possibility is precluded both on experimental grounds and on theoretical grounds by the uncertainty principle.$^7$ Instead we understand the universe to be filled with quantised fields (mathematically speaking, a field assigns some value(s) to every point in space(time); a quantised field is one where the range of possible values is restricted to some discrete set) with each particle being a localised excitation in the energy of the field, e.g., photons (light particles) are excitations in the electromagnetic field, electrons are excitations in the electron field and so on. The rough procedure$^8$ for quantising a field is to treat it as a quantum harmonic oscillator (QHO) at every point in space—one could crudely picture this as an infinite system of connected balls and springs—which naturally results in quantised, discrete energy levels. A 'true' vacuum is therefore the ground state (lowest energy state) in every quantum field. We know that the ground state energy cannot be 0 as that would be the kind of emptiness that cannot exist. Rather, it takes on a value of
\begin{equation} \label{eq:energy}
E_0=\frac{\hbar\omega}{2},
\end{equation}
the ground state of the QHO, where $\hbar$ is the reduced Planck constant and $\omega$ is the angular frequency of the oscillator.

With all this in mind, let's return to the Casimir effect, considering a 1+1-dimensional massless scalar field to simplify and clarify the example. The plates impose what are called 'boundary conditions'; they restrict the frequencies that waves in between the plates can take, in this case to those of standing waves.$^9$ The equation for a standing wave in 1+1 dimensions is
\begin{equation} \label{eq:wave}
\psi_n(x,t)=e^{-i\omega_nt}\sin{\left(\frac{n\pi x}{a}\right)}
\end{equation}
 where $a$ is the width of the cavity between the plates, $n$ is a natural number (a positive whole number) and $\omega_n$ is the angular frequency given by
\begin{equation} \label{eq:omega}
\omega_n=\frac{n\pi c}{a}
\end{equation}
where $c$ is the wave speed. Now, we know that the ground state energy for a QHO is given by equation \ref{eq:energy} and we know that in between the plates only standing waves can exist, so we can only have waves with angular frequency $\omega_n=n\pi c/a$. If we wish to find the vacuum energy between the plates, it seems clear then that all we need to do is sum over the possible ground state energies, giving
\begin{equation} \label{eq:vacuum}
E=\frac{\hbar}{2}\sum^{\infty}_{n=1}\omega_n=\frac{\hbar\pi c}{2a}\sum^{\infty}_{n=1}n.
\end{equation}
Here we run into a problem. Unless you haven't been paying attention, you'll notice that
\begin{equation} \label{eq:natural}
\sum^{\infty}_{n=1}n=1+2+3+4+...
\end{equation}
absolutely positively does not converge at all, and yet here it appears in a physics equation relating to a very real and decidedly measurably finite effect. How can this be? We haven't made a mistake, but we have overlooked one crucial fact. No matter what our plates are made of, they cannot confine arbitrarily high energies of the field; those high energy modes will always be able to escape. So what we need now is to somehow take account of that fact and in doing so somehow assign a finite value to equation \ref{eq:vacuum} and rescue our derivation.

Putting divergent series to work

So what we seek is a way of attaching a meaningful finite value to divergent equations. Let's take a look at Grandi's series again and see if we can come up with anything consistent. We can try cancelling off pairs of terms to give
\begin{align} \label{eq:gpair1}
\sum^{\infty}_{n=0}(-1)^n&=(1-1)+(1-1)+(1-1)+...\nonumber\\
&=0+0+0+...\nonumber\\
&=0
\end{align}
but we can just as easily choose different pairings to give
\begin{align} \label{eq:gpair2}
\sum^{\infty}_{n=0}(-1)^n&=1+(-1+1)+(-1+1)+...\nonumber\\
&=1+0+0+...\nonumber\\
&=1
\end{align}
which is certainly not consistent. We can try re-ordering the series to bring all the $+1$s to the front, but this gives
\begin{align} \label{eq:inf+1}
\sum^{\infty}_{n=0}(-1)^n=1+1+1+...-1-1-1...
\end{align}
As there are an infinite number of $+1$s we never reach the $-1$s and the series approaches $+\infty$. Trying the same process by arranging all the $-1$s to the front will in the same way cause the series to approach $-\infty$. Rather than find a single consistent way of assigning a number to the series, all we have found is four duds.

The reason these methods are all duds is because operations like reordering and cancelling pairs of terms (method of differences) are valid only for convergent series. If we try to apply them to divergent series, the result is clearly a mess. Let's step back and look at the problem from another angle. We want to assign a number to the series; presumably we should be able to manipulate that number algebraically. If we call the series $S$ then after some algebraic juggling we find
\begin{align} \label{eq:S}
S&=1-1+1-1+1-1+...\nonumber\\
1-S&=1-(1-1+1-1+1-1+...)\nonumber\\
&=1-1+1-1+1-1+...\nonumber\\
&=S\nonumber\\
1&=2S\nonumber\\
\Rightarrow S&=1/2
\end{align}
Furthermore, we can consider Grandi's series as an example of the infinite geometric series
\begin{align} \label{eq:geometricG}
\sum_{k=0}^{\infty}ar^k=a+ar+ar^2+ar^3...
\end{align}
where $a=1$ and $r=-1$. Even though Grandi's series is divergent, equation \ref{eq:geometricG} is convergent for $|r|<1$ and in that case
\begin{align} \label{eq:geometricC}
\sum_{k=0}^{\infty}ar^k=\frac{a}{1-r}.
\end{align}
If we substitute $a=1$ and $r=-1$ into equation \ref{eq:geometricC} then we again find $S=1/2$. Neither of these constitutes solid proof in and of itself, but they are highly suggestive. The tool we are looking for is the Cesàro sum.$^{10}$ A series is Cesàro summable when the mean value of its partial sums tends to a given value. For convergent series the Cesàro sum will always equal the number the series converges to and the Cesàro sum is defined for many divergent series too, including Grandi's series. The partial sums of Grandi's series are $1,0,1,0,...$ and so the terms in the Cesàro sequence are $1, 1/2, 2/3,1/2,3/5,1/2,4/7,...$ which clearly converges to $1/2$ in the limit.$^{11}$ In some sense this is quite a satisfying result as our Cesàro sum lies exactly in between the two accumulation points, serving as a kind of average value.
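The Cesàro sequence is mechanical enough to compute directly: take the running totals of the terms (the partial sums), then take the running average of those. Here is a minimal sketch using exact fractions; the function name `cesaro_means` is a label chosen here for illustration.

```python
from fractions import Fraction
from itertools import accumulate

def cesaro_means(terms):
    """Return the running averages of the partial sums of the given (integer) terms."""
    partial_sums = list(accumulate(terms))        # 1, 0, 1, 0, ... for Grandi's series
    running_totals = accumulate(partial_sums)     # sum of the first k partial sums
    return [Fraction(total, k) for k, total in enumerate(running_totals, start=1)]

grandi_terms = [(-1) ** n for n in range(7)]      # 1, -1, 1, -1, 1, -1, 1
print([str(m) for m in cesaro_means(grandi_terms)])
```

Every second mean is exactly $1/2$, and the ones in between approach $1/2$ from above.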

Having fun with zeta function regularisation

The partial sums of equation \ref{eq:natural} are the triangular numbers $T_n$ (so named because they give the numbers of objects that can be arranged into equilateral triangles) $1, 3, 6, 10, 15,...$, so we calculate the Cesàro sequence of equation \ref{eq:natural} and find it goes $1, 2, 10/3, 5, 7,...$, which grows without bound rather than settling on any value. Once again, just as we stop to bask in our moment of triumph, we find our job isn't quite yet done; equation \ref{eq:natural} is not Cesàro summable. We must find another, more sophisticated method for attaching a number to that series.
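The failure is easy to see numerically. This sketch averages the first $k$ triangular numbers exactly; the function name is hypothetical, and the closed form mentioned afterwards is a standard identity rather than something derived in this post.

```python
from fractions import Fraction

def cesaro_mean_of_naturals(k):
    """Average of the first k partial sums (triangular numbers) of 1 + 2 + 3 + ..."""
    triangular = [n * (n + 1) // 2 for n in range(1, k + 1)]
    return Fraction(sum(triangular), k)

for k in (1, 2, 3, 4, 5, 100, 1000):
    print(k, cesaro_mean_of_naturals(k))
```

The $k^{\text{th}}$ mean works out to $(k+1)(k+2)/6$, which grows quadratically, so averaging the partial sums does nothing to tame this particular series.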

Let us consider the series
\begin{align} \label{eq:dirichlet}
D(s)=\sum^{\infty}_{n=1}n^{-s}, \text{ Re}(s)>1
\end{align}
where $s$ is a complex number and $\text{Re}(s)$ denotes the real part of $s$. For $s=-1$ this series would be exactly the same as equation \ref{eq:natural}, but the series is not defined for $s=-1$.$^{12}$ However, we saw back in equation \ref{eq:geometricC} that on that occasion if we applied the convergent case equation to Grandi's series we got the result of $1/2$ that turned out to be right; perhaps we could do something similar here? As it would happen, we can, but first I would implore you to, in the great words of John Arnold, "Hold on to your butts".

In the domain $\text{Re}(s)>1$, $\zeta(s)=D(s)$ where $\zeta(s)$ is known as the Riemann zeta function.$^{13}$ Unlike $D(s)$, $\zeta(s)$ is defined over the entire complex plane (apart from a single pole at $s=1$) and is known as an analytic continuation of $D(s)$. Analytic continuation is a wonderfully useful (and perplexing) tool of complex analysis whereby the domain of an analytic ('well-behaved') function can be extended. While this may not seem like a big deal, it raises the question as to whether or not a function can be continued arbitrarily; if our original function is only defined for some small domain and we wish to extend that domain, what is to stop us giving it such-and-such value in the extended domain instead of some other value? As it would happen, the identity theorem states (very roughly) that any two holomorphic functions (all complex analytic functions are holomorphic) that are equal to each other on some open region of a given connected domain, however small, must be equal over the entire domain, and thus there is only one unique way to analytically continue a function.

We wish to know what $\zeta(-1)$ is so we can assign that value to $D(-1)$ through the magic of analytic continuation. For negative whole numbers $n<0$,
\begin{equation} \label{eq:negzeta} \zeta(n)=-\frac{B_{1-n}}{1-n}
\end{equation}
where $B_{1-n}$ is the '$1-n$'$^{\text{th}}$ Bernoulli number.$^{14}$ In the case of $n=-1$ we have
 \begin{equation} \label{eq:-1zeta}
\zeta(-1)=-\frac{B_2}{2}=-\frac{1}{12}.
\end{equation}
Thus we can assign to the divergent series $1+2+3+4+...$ the value of $-1/12$. If this strikes you as strange or even suspicious then I applaud your scepticism; there is indeed something plainly odd about assigning the value of a small, fractional negative number to a series of ever-increasing positive whole numbers. This is not at all like the neat case of Grandi's series where we had our value lying neatly between the accumulation points. But before we throw our hands up in despair, recall our motivation for this investigation, the Casimir effect. What happens if we use our value of $-1/12$ there?
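Before putting that value to use, equation \ref{eq:negzeta} can be checked with a few lines of code. The sketch below generates Bernoulli numbers from the standard recurrence $\sum_{k=0}^{m}\binom{m+1}{k}B_k=0$ (for $m\geq 1$, with $B_0=1$) and then applies the formula. The recurrence is a swapped-in method for obtaining $B_n$, not something derived in this post, and the function names are hypothetical. The sign convention for $B_1$ does not matter here because only $B_2$ and higher are used.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m_max):
    """Return [B_0, B_1, ..., B_m_max] as exact fractions, using the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        acc = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-acc / (m + 1))
    return B

def zeta_negative_integer(n):
    """Evaluate zeta(n) = -B_{1-n} / (1 - n) for a negative whole number n."""
    if n >= 0:
        raise ValueError("this formula is only used here for negative whole numbers")
    B = bernoulli_numbers(1 - n)
    return -B[1 - n] / (1 - n)

for n in (-1, -2, -3):
    print(n, zeta_negative_integer(n))
```

For $n=-1$ this reproduces equation \ref{eq:-1zeta}.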

As it would happen, to do so is to use a technique known as zeta function regularisation. We replace a divergent series with a 'regulator' in the form of a zeta function (although other regulators exist, each with different strengths and weaknesses) and in doing so remove unphysical infinities from our theory. If we have regularised correctly, then by the time we have reached our final result, the regulator will have disappeared; it is nothing more than a 'trick' for calculating the correct value and so it should not still appear at the last step. Applying this to equation \ref{eq:vacuum} gives
\begin{equation} \label{eq:zetanorm}
E=\frac{\hbar\pi c}{2a}\sum^{\infty}_{n=1}n=\frac{\hbar\pi c}{2a}\zeta(-1)=-\frac{\hbar\pi c}{24a}.
\end{equation}
The force between the two plates is given by the negative gradient of the energy:
\begin{equation} \label{eq:force}
F=-\frac{\partial E}{\partial a}=-\frac{\partial}{\partial a}\left(-\frac{\hbar\pi c}{24a}\right)=-\frac{\hbar\pi c}{24a^2}
\end{equation}
which, lo and behold, is exactly the right result.
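As a sanity check on the $-1/12$, here is a sketch of one of the other regulators alluded to above. Recall that real plates cannot confine arbitrarily high-energy modes; one crude way to model that is to damp the $n^{\text{th}}$ term by a factor $e^{-\varepsilon n}$ and look at what happens as $\varepsilon$ becomes small. This exponential cutoff is an assumption made here purely for illustration (it is not the zeta function method used in this post), and the function names are hypothetical. The damped sum has the known expansion $\sum_{n=1}^{\infty}ne^{-\varepsilon n}=\frac{1}{\varepsilon^2}-\frac{1}{12}+\frac{\varepsilon^2}{240}-...$, so once the divergent $1/\varepsilon^2$ piece is subtracted the leftover should approach $-1/12$.

```python
import math

def cutoff_sum(eps):
    """Sum of n * exp(-eps * n) over n >= 1, truncated once the terms are negligible."""
    n_max = int(60 / eps)  # beyond this point exp(-eps * n) is below about 1e-26
    return math.fsum(n * math.exp(-eps * n) for n in range(1, n_max + 1))

def finite_part(eps):
    """What remains of the damped sum after removing the divergent 1/eps^2 piece."""
    return cutoff_sum(eps) - 1 / eps ** 2

for eps in (0.2, 0.1, 0.05, 0.01):
    print(eps, finite_part(eps), -1 / 12)
```

The remainder settles towards $-0.0833...$ as the cutoff is relaxed, which is at least reassuring: a regulator motivated by the physics lands on the same finite number as the zeta function.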

Final thoughts

The prompt for my writing this lengthy explanation of infinite series, divergences, regularisation and so on was a minor flurry on the Internet a little while ago due to a somewhat dodgy derivation of the result $1+2+3+4+...=-1/12$. In order to get this result a number of those forbidden–for–divergent–series operations were used for simplicity's sake, but in doing so I felt the important subtlety between a convergent series equalling a number and a divergent series being assigned a value (albeit in a rigorous way) was lost, and that is an important distinction to make. You cannot keep adding $1+2+3+4+...$ and then through the magic of infinity come up with a $-1/12$ at the end; that series will always diverge and will always approach $+\infty$, but as we have seen we can rigorously assign the value of $-1/12$ to it for the purposes of removing infinities from our calculations using, in this case, the Riemann zeta function.

During the conception of this post I did ponder a question which continues to interest me, though. We saw from the example of Grandi's series that there are some mathematical operations and manipulations which would be fine in 'normal' mathematics but which suddenly become verboten when done in the specific context of a divergent series. The question is: is this a fundamental property of the mathematics in question, that is to say, the underlying patterns and structures, or is it one emergent from notational limits? Is a rearrangement of terms in a divergent series actually fundamentally different to a rearrangement of terms in a convergent series, or is it the same thing that manifests different results in different contexts? I am not sure if this question can be answered sensibly, but for my money I am reminded of the old dichotomy in the philosophy of mathematics whose question is still yet to find an answer: Is mathematics discovered or invented? Now there is truly some food for thought.

Notes

$1$. There are other ways to use sigma notation. For example, equation \ref{eq:natural} can also be represented as $\sum_{n\geq 1}n$ ($n$ being a whole number is implied by the discrete sum being used instead of the continuous integral) or $\sum_{n\in\mathbb{N}}n$. Other less common examples of sigma notation are $\sum_{p\text{ prime}}\frac{1}{p}$ which is the sum of the reciprocals of all prime numbers and $\sum_{d \vert n}d^x$ which is the divisor function, where '$d \vert n$' means $d$ divides $n$ exactly.

$2$. The sum shown here has the interesting property of being a series representation of the number 1, or in the given notation, $\sum_{n=1}^{\infty}\left(\frac{1}{2}\right)^n=1$. For any readers who are passingly familiar with Ancient Greek philosophy, this fact can be viewed as a solution to Zeno's dichotomy paradox, albeit one which ignores some nuances which I will address at a later date when I cover supertasks, one of the many intersections of philosophy and mathematics.

$3$. Formally, there exists a limit $S$ such that for any (arbitrarily small) number $\epsilon>0$ there is a number $N$ such that for all $n>N$, $|S_n-S|<\epsilon$ where $S_n$ is the $n^{\text{th}}$ partial sum. Informally, there exists a number $S$ such that for every sufficiently large $n$ the partial sum $S_n$ will be as close to $S$ as we like.

$4$. The factorial operator is defined as $n!=n(n-1)(n-2)...1$, or in words, $n!$ (read '$n$ factorial') is given by the multiplication of $n$ by all the whole numbers less than $n$ going down to $1$. As an example, $5!=5\cdot4\cdot3\cdot2\cdot1=120$.

$5$. I will not delve here into the depths of conditional and absolute convergence, almost convergence, and so on. Suffice it to say that there are a great many interesting infinite series that have a great many interesting properties relating to convergence behaviour other than those simple ones shown here. If you are especially interested in the topic, I recommend investigating the Riemann series theorem for a very interesting and surprising property of conditionally convergent series.

$6$. The harmonic series is deceptively interesting and there are many, many different and varied ways of calculating the harmonic numbers. One example straight out of equation \ref{eq:harmonic} is the recurrence relation (an equation which gives one term in a sequence in terms of a previous one) $H_n=H_{n-1}+1/n$. I encourage you to investigate others and see where it leads you!

$7$. The value of a field and the value of the derivative of the field at a given point in space cannot be known to arbitrary accuracy; the better one is known the less well the other must be. Though the uncertainty principle is often raised as a weird and wonderful result of quantum mechanics, in fact it is a feature of any wave theory and is linked intimately with Fourier transformations, although a thorough demonstration of how this is so is sadly beyond the scope of this note.

$8$. The complexities of quantum field theory should by no means be underestimated; what I am presenting here is an extraordinarily simplified version that, while instructive, would not necessarily be very useful in practice.

$9$. Standing waves are waves with nodes (points of zero displacement) at the endpoints. An example would be a plucked guitar string, which is restricted from moving at the bridge and nut. This restriction ensures the only possible wavelengths are given by $\lambda_n=2a/n$ for length $a$ where $n$ is a natural number. Using the wave relationship $c=\nu_n\lambda_n$ we find equivalently $\nu_n=nc/2a$ (or equation \ref{eq:omega} as $\omega=2\pi\nu$) which gives the frequency $\nu$ for the $n^{\text{th}}$ harmonic.

$10$. Or rather, one of the tools, as we could equally have chosen the Abel sum, the Borel sum, the $1/x$ series method, or a number of others. Cesàro summation is far from the only rigorous way of dealing with Grandi's series, but what is important is that it gives a value of 1/2, as do the methods I listed above—this consistency is a big hint that we have picked the right number to assign to the series.

$11$. It is worth noting that if we 'dilute' the series by adding in $+0$s we change the value of the Cesàro sum (although the summability is not affected). This illuminates yet another mathematical manipulation which would be fine for a convergent series but is not for a divergent series.

$12$. Precisely because it would become divergent.

$13$. If $e$ is my favourite number then $\zeta(s)$ is surely my favourite function, but in exactly the same way I could not possibly hope to explain why except in another post devoted to it exclusively.

$14$. For fear of overwhelming you with yet more beautiful mathematics in an already over-long post I will avoid the temptation of discussing the Bernoulli numbers, although as ever I encourage the interested reader to investigate for themselves!
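
The claims in notes 10 and 11 about Cesàro summation can be checked numerically. The following is a minimal Python sketch with hypothetical names: it averages the partial sums of Grandi's series and of one particular diluted version, $1+0-1+1+0-1+...$, a dilution pattern I have assumed purely for illustration. The first average should sit at $1/2$, while the second should come out at a different value, showing that inserting zeros is not a harmless operation here.

```python
from itertools import cycle, islice

def cesaro_mean(terms, n_terms):
    """Average of the first n_terms partial sums of the series with the given terms."""
    partial = 0            # running partial sum S_k
    sum_of_partials = 0    # S_1 + S_2 + ... + S_k
    for term in islice(terms, n_terms):
        partial += term
        sum_of_partials += partial
    return sum_of_partials / n_terms

grandi = cycle([1, -1])       # 1 - 1 + 1 - 1 + ...
diluted = cycle([1, 0, -1])   # 1 + 0 - 1 + 1 + 0 - 1 + ... (assumed dilution pattern)

print("Grandi :", cesaro_mean(grandi, 100_000))
print("Diluted:", cesaro_mean(diluted, 99_999))
```

The number of terms is chosen as a whole number of periods in each case so that the averages are not skewed by a partial cycle; for other lengths the values agree only up to a correction that shrinks as more terms are taken.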