Sequences, series and summation notation
In mathematical parlance, the term 'sequence' carries much the same meaning as in regular speech; it refers to an ordered list of numbers that usually follows a rule. For example, the sequence $1, 2, 3, 4, 5, ...$ (onwards without end) is given by adding 1 to the previous number in the sequence, beginning at 1. The definition of the term 'series' is less obvious; one description is that a series is the sum of a sequence, or in other words, to obtain a series one adds all the terms of a sequence. For a finite series (a series with finitely many terms, or alternatively a series with a last term), the mathematics is typically fairly straightforward and so we will focus instead on the topic of infinite series, but before we do so, I will make a brief digression to discuss notation.
For reasons I am sympathetic to, few people enjoy reading about mathematical notation and it is difficult to write about it in a way which is interesting. However, while it is possible to write a series as, for example, $1+2+3+4+5+...$ this soon becomes cumbersome and is very limiting for finite series with a large number of terms and for series for which the rule or pattern is not obvious from the first few terms. For these reasons, mathematicians have developed a notation for series which I will adopt in the remainder of this post for instructional purposes (side–by–side with the long-hand version for clarity).
The notation is referred to as summation notation or sigma notation and uses a large capital sigma $\Sigma$ to denote the summation. The rule is written to the right of the sigma in terms of an index of summation; the lowest value of the index is written under the sigma and the highest value above it (or '$\infty$' for an infinite series with no final term).$^1$
A simple example of sigma notation in action is the finite series
\begin{align} \label{eq:square5}
\sum_{n=1}^{5}n^2 & = 1^2+2^2+3^2+4^2+5^2 \nonumber \\
& = 1+4+9+16+25,
\end{align}
which in this case is equal to $55$. Here, the index of summation $n$ appears prominently in the rule as the number that is being squared. Another example, slightly trickier this time, is the infinite series
\begin{align} \label{eq:Zeno}
\sum_{n=1}^{\infty}\left(\frac{1}{2}\right)^n
& = \left(\frac{1}{2}\right)^1+\left(\frac{1}{2}\right)^2+\left(\frac{1}{2}\right)^3+\left(\frac{1}{2}\right)^4+... \nonumber \\
& = \frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\frac{1}{16}+...
\end{align}
where the index of summation is this time the power that $1/2$ is raised to in each term of the sum.$^2$
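As a concrete check on the notation, here is a minimal Python sketch that reads a sigma expression as a loop: the index runs from the lower limit to the upper limit and the rule is applied at each step. The helper name `sigma` is my own, chosen purely for illustration. An infinite upper limit cannot be looped over, so the second series is truncated after 50 terms, an arbitrary choice.

```python
from fractions import Fraction

def sigma(rule, lower, upper):
    """Sum rule(n) for n = lower, lower + 1, ..., upper (inclusive)."""
    return sum(rule(n) for n in range(lower, upper + 1))

# The finite series of the first five squares.
squares = sigma(lambda n: n**2, 1, 5)

# The series in powers of 1/2, truncated after 50 terms.
# Exact fractions are used so that no rounding creeps in.
zeno_50 = sigma(lambda n: Fraction(1, 2)**n, 1, 50)

print(squares)        # 55
print(1 - zeno_50)    # the gap between the truncated sum and 1
```

The truncated sum falls short of 1 by exactly $(1/2)^{50}$, which gives a feel for how the infinite version can be attached to a single number.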
Infinite series behaving badly
An infinite series is in many ways a different beast to a finite series. Possibly the clearest difference is conceptual; finite series are computed fairly easily, in principle at least: it is simply a case of adding so many numbers together and then reading the value off your calculator. This is not possible with an infinite series, as there is no final term and no opportunity to hit a final '$=$' button on the calculator, as the sum goes on forever. This is not always a problem, however. Some series are known as 'convergent', which is to say that they are in effect equal or equivalent to some number. There are many tests for convergence, but suffice it to say that we know a series is convergent when its running total approaches said number ever more closely as more terms are added.$^3$ We have already seen a convergent series in equation \ref{eq:Zeno}, but another example of a convergent series is
\begin{equation}
\label{eq:euler}
e=\sum^{\infty}_{n=0}\frac{1}{n!}=1+1+\frac{1}{2}+\frac{1}{6}+...
\end{equation}
where $!$ is the factorial operator.$^4$ As you can see, this series converges to $e$, a truly marvellous number which happens to be my favourite mathematical constant, but an explanation of that fact would require another post entirely.
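To see the convergence in action, the following sketch adds up the first few terms of the series and compares the running total with the value of $e$ stored in Python's `math` module. The function name `e_partial_sum` and the particular term counts are illustrative choices of mine.

```python
import math

def e_partial_sum(terms):
    """Add the first `terms` terms of 1/n!, starting from n = 0."""
    total = 0.0
    for n in range(terms):
        total += 1 / math.factorial(n)
    return total

# Print the number of terms, the running total and the remaining gap to e.
for terms in (1, 2, 5, 10, 15):
    approx = e_partial_sum(terms)
    print(terms, approx, abs(math.e - approx))
```

The gap shrinks very quickly because the factorial in the denominator grows so fast; fifteen terms already agree with $e$ to better than ten decimal places.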
If a series is not convergent, however, then we call it divergent,$^5$ and that is where the trouble starts. A straightforward example is the harmonic series
\begin{equation}
\label{eq:harmonic}
\sum^{\infty}_{n=1}\frac{1}{n}=1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+...
\end{equation}
which despite its similarity to equation \ref{eq:Zeno} increases indefinitely and does not approach any particular number. This is in some sense easy to understand intuitively; one can see how the sum would just keep getting bigger and bigger. This kind of reasoning is based around partial sums, which are truncations of infinite series. For the harmonic series the partial sum is given by
\begin{equation} \label{eq:Hn}
H_n=\sum^{n}_{k=1}\frac{1}{k}=1+\frac{1}{2}+\frac{1}{3}+...+\frac{1}{n},
\end{equation}
where the $n^{\text{th}}$ partial sum $H_n$ is known as the $n^{\text{th}}$ harmonic number. (Note that here we use $n$ to label the partial sum and so the role of the index of summation is taken over by $k$ to avoid confusion, although we could have chosen $k$ as our label and kept $n$ the index of summation if we wished; it makes no difference). Partial sums are finite series by design and so just like equation \ref{eq:square5} there is a final term after which we can hit the metaphorical '$=$' button. If we do so after 2 terms we find $H_2=3/2$, after 3 terms $H_3=11/6$, 4 terms $H_4=25/12$ and so on.$^6$
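The harmonic numbers quoted above can be reproduced exactly with a few lines of Python. The sketch also checks a standard grouping argument that is not spelled out in the text: since $1/3+1/4\geq 1/2$, $1/5+1/6+1/7+1/8\geq 1/2$ and so on, every doubling of $n$ adds at least $1/2$, so that $H_{2^k}\geq 1+k/2$ and the partial sums cannot level off. The function name `harmonic` is my own.

```python
from fractions import Fraction

def harmonic(n):
    """Return the n-th harmonic number H_n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

print(harmonic(2), harmonic(3), harmonic(4))   # 3/2 11/6 25/12

# Grouping argument: each doubling of n adds at least 1/2 to the partial sum.
for k in range(0, 11):
    assert harmonic(2**k) >= 1 + Fraction(k, 2)
```

The growth is real but painfully slow: reaching $1+k/2$ is only guaranteed after $2^k$ terms, which is why the divergence is easy to miss on a calculator.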
Now let us consider an altogether different beast known as Grandi's series. Grandi's series is given as
\begin{equation} \label{eq:Grandi}
\sum^{\infty}_{n=0}(-1)^n=1-1+1-1+1-1+...
\end{equation}
Unlike the harmonic series, the partial sums of Grandi's series do not seem to trend in any particular direction, but rather alternate between two 'accumulation points' at 1 and 0. This peculiarity will cause us some trouble, but to see why first we will make a brief foray into the physical sciences.
A brief foray into the physical sciences
Suppose we have two thin neutral conductive plates placed parallel to each other very close together in a vacuum. According to classical physics (and everyday intuition) absolutely nothing will happen. However, this is not what we observe; in fact, the two plates will experience an attractive force attempting to bring them together. This is known as the Casimir effect, and can only be understood in terms of quantum mechanics. The Casimir effect is typically expressed in terms of quantum electrodynamics, but there is nothing inherently electrodynamic about the effect and so one can consider many analogous scenarios with equivalent Casimir effects. In order to greatly simplify our derivation I will choose to do just that.
One thing that it is essential to be aware of is that in quantum mechanics a vacuum is not 'empty' in the sense that there is nothing at all there; so far as we understand, such an emptiness cannot exist in the physical universe. This possibility is precluded both on experimental grounds and on theoretical grounds by the uncertainty principle.$^7$ Instead we understand the universe to be filled with quantised fields (mathematically speaking, a field assigns some value(s) to every point in space(time); a quantised field is one where the range of possible values is restricted to some discrete set) with each particle being a localised excitation in the energy of the field, e.g., photons (light particles) are excitations in the electromagnetic field, electrons are excitations in the electron field and so on. The rough procedure$^8$ for quantising a field is to treat it as a quantum harmonic oscillator (QHO) at every point in space—one could crudely picture this as an infinite system of connected balls and springs—which naturally results in quantised, discrete energy levels. A 'true' vacuum is therefore the ground state (lowest energy state) in every quantum field. We know that the ground state energy cannot be 0 as that would be the kind of emptiness that cannot exist. Rather, it takes on a value of
\begin{equation} \label{eq:energy}
E_0=\frac{\hbar\omega}{2},
\end{equation}
the ground state of the QHO, where $\hbar$ is the reduced Planck constant and $\omega$ is the angular frequency of the oscillator.
With all this in mind, let's return to the Casimir effect, considering a 1+1-dimensional massless scalar field to simplify and clarify the example. The plates impose what are called 'boundary conditions': they restrict the frequencies that waves in between the plates can take, in this case to those of standing waves.$^9$ The equation for a standing wave in 1+1 dimensions is
\begin{equation} \label{eq:wave}
\psi_n(x,t)=e^{-i\omega_nt}\sin{\left(\frac{n\pi x}{a}\right)}
\end{equation}
where $a$ is the width of the cavity between the plates, $n$ is a natural number (a positive whole number) and $\omega_n$ is the angular frequency given by
\begin{equation} \label{eq:omega}
\omega_n=\frac{n\pi c}{a}
\end{equation}
where $c$ is the wave speed. Now, we know that the ground state energy for a QHO is given by equation \ref{eq:energy} and we know that in between the plates only standing waves can exist, so we can only have waves with angular frequency $\omega_n=n\pi c/a$. If we wish to find the vacuum energy between the plates, it seems clear then that all we need to do is sum over the possible ground state energies, giving
\begin{equation} \label{eq:vacuum}
E=\frac{\hbar}{2}\sum^{\infty}_{n=1}\omega_n=\frac{\hbar\pi c}{2a}\sum^{\infty}_{n=1}n.
\end{equation}
Here we run into a problem. Unless you haven't been paying attention, you'll notice that
\begin{equation} \label{eq:natural}
\sum^{\infty}_{n=1}n=1+2+3+4+...
\end{equation}
absolutely positively does not converge at all, and yet here it appears in a physics equation relating to a very real and decidedly measurably finite effect. How can this be? We haven't made a mistake, but we have overlooked one crucial fact. No matter what our plates are made of, they cannot confine arbitrarily high energies of the field; those high energy modes will always be able to escape. So what we need now is to somehow take account of that fact and in doing so somehow assign a finite value to equation \ref{eq:vacuum} and rescue our derivation.
Putting divergent series to work
So what we seek is a way of attaching a meaningful finite value to divergent equations. Let's take a look at Grandi's series again and see if we can come up with anything consistent. We can try cancelling off pairs of terms to give
\begin{align} \label{eq:gpair1}
\sum^{\infty}_{n=0}(-1)^n&=(1-1)+(1-1)+(1-1)+...\nonumber\\
&=0+0+0+...\nonumber\\
&=0
\end{align}
but we can just as easily choose different pairings to give
\begin{align} \label{eq:gpair2}
\sum^{\infty}_{n=0}(-1)^n&=1+(-1+1)+(-1+1)+...\nonumber\\
&=1+0+0+...\nonumber\\
&=1
\end{align}
which is certainly not consistent. We can try re-ordering the series to bring all the $+1$s to the front, but this gives
\begin{align} \label{eq:inf+1}
\sum^{\infty}_{n=0}(-1)^n=1+1+1+...-1-1-1-...
\end{align}
As there are an infinite number of $+1$s we never reach the $-1$s and the series approaches $+\infty$. Trying the same process by arranging all the $-1$s to the front will in the same way cause the series to approach $-\infty$. Rather than find a single consistent way of assigning a number to the series, all we have found is four duds.
The reason these methods are all duds is that operations like reordering and cancelling pairs of terms (method of differences) are valid only for convergent series. If we try to apply them to divergent series, the result is clearly a mess. Let's step back and look at the problem from another angle. We want to assign a number to the series; presumably we should be able to manipulate that number algebraically. If we call the series $S$ then after some algebraic juggling we find
\begin{align} \label{eq:S}
S&=1-1+1-1+1-1+...\nonumber\\
1-S&=1-(1-1+1-1+1-1+...)\nonumber\\
&=1-1+1-1+1-1+...\nonumber\\
&=S\nonumber\\
1&=2S\nonumber\\
\Rightarrow S&=1/2
\end{align}
Furthermore, we can consider Grandi's series as an example of the infinite geometric series
\begin{align} \label{eq:geometricG}
\sum_{k=0}^{\infty}ar^k=a+ar+ar^2+ar^3+...
\end{align}
where $a=1$ and $r=-1$. Even though Grandi's series is divergent, equation \ref{eq:geometricG} is convergent for $|r|<1$ and in that case
\begin{align} \label{eq:geometricC}
\sum_{k=0}^{\infty}ar^k=\frac{a}{1-r}.
\end{align}
If we substitute $a=1$ and $r=-1$ into equation \ref{eq:geometricC} then we again find $S=1/2$. Neither of these constitutes solid proof in and of itself, but they are highly suggestive. The tool we are looking for is the Cesàro sum.$^{10}$ A series is Cesàro summable when the mean value of its partial sums tends to a given value. For convergent series the Cesàro sum will always equal the number the series converges to and the Cesàro sum is defined for many divergent series too, including Grandi's series. The partial sums of Grandi's series are $1,0,1,0,...$ and so the terms in the Cesàro sequence are $1, 1/2, 2/3,1/2,3/5,1/2,4/7,...$ which clearly converges to $1/2$ in the limit.$^{11}$ In some sense this is quite a satisfying result as our Cesàro sum lies exactly in between the two accumulation points, serving as a kind of average value.
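The Cesàro sequence is easy to generate mechanically: form the partial sums, then take the running mean of those partial sums. The sketch below does this with exact fractions; the function name `cesaro_means` is a label of my own.

```python
from fractions import Fraction
from itertools import accumulate

def cesaro_means(terms):
    """Return the running means of the partial sums of `terms`."""
    partial_sums = list(accumulate(terms))
    running_totals = accumulate(partial_sums)
    return [Fraction(total, n) for n, total in enumerate(running_totals, start=1)]

grandi = [(-1)**n for n in range(7)]      # 1, -1, 1, -1, ...
print([str(m) for m in cesaro_means(grandi)])
# ['1', '1/2', '2/3', '1/2', '3/5', '1/2', '4/7']
```

Every even-numbered mean is exactly $1/2$, while the odd-numbered means sit slightly above it and close in from there.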
Having fun with zeta function regularisation
The partial sums of equation \ref{eq:natural} are the triangular numbers $T_n$ (so named because they give the numbers of objects that can be arranged into equilateral triangles) $1, 3, 6, 10, 15,...$, so we calculate the Cesàro sequence of equation \ref{eq:natural} and find it goes $1, 2, 10/3, 5, 7,...$ Once again, just as we stop to bask in our moment of triumph, we find our job isn't quite yet done; equation \ref{eq:natural} is not Cesàro summable. We must find another, more sophisticated method for attaching a number to that series.
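The failure can be made explicit with a short calculation. Using the standard formulae $T_k=k(k+1)/2$ and $\sum_{k=1}^{n}T_k=n(n+1)(n+2)/6$ (neither is derived in this post, and the symbol $C_n$ for the $n^{\text{th}}$ term of the Cesàro sequence is introduced here only for convenience), we have
\begin{align}
C_n=\frac{1}{n}\sum_{k=1}^{n}T_k&=\frac{1}{n}\cdot\frac{n(n+1)(n+2)}{6}\nonumber\\
&=\frac{(n+1)(n+2)}{6},\nonumber
\end{align}
which reproduces $1, 2, 10/3, 5, 7$ for $n=1,...,5$ and grows roughly like $n^2/6$. The means therefore run off to infinity rather than settling on any finite value.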
Let us consider the series
\begin{align} \label{eq:dirichlet}
D(s)=\sum^{\infty}_{n=1}n^{-s}, \text{ Re}(s)>1
\end{align}
where $s$ is a complex number and $\text{Re}(s)$ denotes the real part of $s$. For $s=-1$ this series would be exactly the same as equation \ref{eq:natural}, but the series is not defined for $s=-1$.$^{12}$ However, we saw back in equation \ref{eq:geometricC} that on that occasion if we applied the convergent case equation to Grandi's series we got the result of $1/2$ that turned out to be right; perhaps we could do something similar here? As it would happen, we can, but first I would implore you to, in the great words of John Arnold, "Hold on to your butts".
In the domain $\text{Re}(s)>1$, $\zeta(s)=D(s)$ where $\zeta(s)$ is known as the Riemann zeta function.$^{13}$ Unlike $D(s)$, $\zeta(s)$ is defined over the entire complex plane apart from a single pole at $s=1$, and is known as an analytic continuation of $D(s)$. Analytic continuation is a wonderfully useful (and perplexing) tool of complex analysis whereby the domain of an analytic ('well-behaved') function can be extended. While this may not seem like a big deal, it raises the question as to whether or not a function can be continued arbitrarily; if our original function is only defined for some small domain and we wish to extend that domain, what is to stop us giving it such–and–such value in the extended domain instead of some other value? As it would happen, the identity theorem states (very roughly) that any two holomorphic functions (all complex analytic functions are holomorphic) that are equal to each other on some small region of a given connected domain must be equal over the entire domain, and thus there is only one way to analytically continue a function.
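A familiar small-scale illustration, offered as a sketch rather than a proof, is the geometric series of equation \ref{eq:geometricC} with $a=1$. Writing $f$ for the series and $g$ for the closed form (both labels are mine), we have
\begin{align}
f(r)&=\sum_{k=0}^{\infty}r^k, \quad |r|<1,\nonumber\\
g(r)&=\frac{1}{1-r}, \quad r\neq 1.\nonumber
\end{align}
The two agree wherever $f$ is defined, but $g$ makes sense on the whole complex plane apart from $r=1$. In this sense $g$ is the analytic continuation of $f$, and $g(-1)=1/2$ is the value we attached to Grandi's series earlier. The relationship between $D(s)$ and $\zeta(s)$ is of the same kind.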
We wish to know what $\zeta(-1)$ is so we can assign that value to $D(-1)$ through the magic of analytic continuation. For negative whole numbers $n<0$,
\begin{equation} \label{eq:negzeta}
\zeta(n)=-\frac{B_{1-n}}{1-n}
\end{equation}
where $B_{1-n}$ is the '$1-n$'$^{\text{th}}$ Bernoulli number.$^{14}$ In the case of $n=-1$ we have
\begin{equation} \label{eq:-1zeta}
\zeta(-1)=-\frac{B_2}{2}=-\frac{1}{12}.
\end{equation}
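For readers who would like to verify the arithmetic, here is a minimal sketch that generates Bernoulli numbers from the standard recurrence $\sum_{k=0}^{m}\binom{m+1}{k}B_k=0$ (for $m\geq 1$, with $B_0=1$) and then evaluates equation \ref{eq:negzeta}. The recurrence is not part of the discussion above, and the function names are my own; the code assumes Python 3.8 or later for `math.comb`.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] as exact fractions (B_1 = -1/2 convention)."""
    numbers = [Fraction(1)]
    for m in range(1, count):
        # Recurrence: sum_{k=0}^{m} C(m+1, k) * B_k = 0, solved for B_m.
        partial = sum(comb(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-partial / (m + 1))
    return numbers

def zeta_at_negative_integer(n):
    """Evaluate zeta(n) = -B_{1-n} / (1-n) for an integer n < 0."""
    if n >= 0:
        raise ValueError("this formula is only used here for negative integers")
    index = 1 - n
    return -bernoulli_numbers(index + 1)[index] / index

print(zeta_at_negative_integer(-1))   # -1/12
```

The same function gives $\zeta(-3)=1/120$, the value attached in the same way to $1+8+27+64+...$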
Thus we can assign to the divergent series $1+2+3+4+...$ the value of $-1/12$. If this strikes you as strange or even suspicious then I applaud your scepticism; there is indeed something plainly odd about assigning the value of a small, fractional negative number to a series of ever-increasing positive whole numbers. This is not at all like the neat case of Grandi's series where we had our value lying neatly between the accumulation points. But before we throw our hands up in despair, recall our motivation for this investigation, the Casimir effect. What happens if we use our value of $-1/12$ there?
As it would happen, to do so is to use a technique known as zeta function regularisation. We replace a divergent series with a 'regulator' in the form of a zeta function (although other regulators exist, each with different strengths and weaknesses) and in doing so remove unphysical infinities from our theory. If we have regularised correctly, then by the time we have reached our final result, the regulator will have disappeared; it is nothing more than a 'trick' for calculating the correct value and so it should not still appear at the last step. Substituting $\zeta(-1)$ for the divergent sum in equation \ref{eq:vacuum} gives
\begin{equation} \label{eq:zetanorm}
E=\frac{\hbar\pi c}{2a}\sum^{\infty}_{n=1}n=\frac{\hbar\pi c}{2a}\zeta(-1)=-\frac{\hbar\pi c}{24a}.
\end{equation}
The force between the two plates is given by the negative gradient of the energy:
\begin{equation} \label{eq:force}
F=-\frac{\partial E}{\partial a}=-\frac{\partial}{\partial a}\left(-\frac{\hbar\pi c}{24a}\right)=-\frac{\hbar\pi c}{24a^2}
\end{equation}
which, lo and behold, is exactly the right result.
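Since other regulators exist, it is reassuring to check that a different one leaves behind the same finite part. The sketch below is my own illustration rather than part of the derivation above: it damps each term of $1+2+3+4+...$ by a factor $e^{-\epsilon n}$, a smooth stand-in for the plates' inability to confine high energy modes, and then subtracts the piece that blows up as $\epsilon\to 0$. It is a standard result that $\sum_{n\geq 1}ne^{-\epsilon n}=1/\epsilon^2-1/12+O(\epsilon^2)$, so what remains should approach $-1/12$. The names `damped_sum`, `finite_part` and `eps` are assumptions made for the example.

```python
import math

def damped_sum(eps):
    """Sum n * exp(-eps * n) over n >= 1, truncated once terms are negligible."""
    n_max = int(50 / eps)          # beyond this, exp(-eps * n) is below about exp(-50)
    return math.fsum(n * math.exp(-eps * n) for n in range(1, n_max + 1))

def finite_part(eps):
    """Remove the divergent 1/eps^2 piece from the damped sum."""
    return damped_sum(eps) - 1 / eps**2

# The finite part should drift towards -1/12 as the damping is switched off.
for eps in (0.5, 0.1, 0.01):
    print(eps, finite_part(eps))

print(-1 / 12)
```

The agreement between two quite different regulators is a good sign that $-1/12$ is a property of the series itself and not an accident of the zeta function.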
Final thoughts
The prompt for my writing this lengthy explanation of infinite series, divergences, regularisation and so on was a minor flurry on the Internet a little while ago due to a somewhat dodgy derivation of the result $1+2+3+4+...=-1/12$. In order to get this result a number of those forbidden–for–divergent–series operations were used for simplicity's sake, but in doing so I felt the important subtlety between a convergent series equalling a number and a divergent series being assigned a value (albeit in a rigorous way) was lost, and that is an important distinction to make. You cannot keep adding $1+2+3+4+...$ and then through the magic of infinity come up with a $-1/12$ at the end; that series will always diverge and will always approach $+\infty$, but as we have seen we can rigorously assign the value of $-1/12$ to it for the purposes of removing infinities from our calculations using, in this case, the Riemann zeta function.
During the conception of this post I did ponder a question which continues to interest me, though. We saw from the example of Grandi's series that there are some mathematical operations and manipulations which would be fine in 'normal' mathematics but which suddenly become verboten when done in the specific context of a divergent series. The question is: is this a fundamental property of the mathematics in question, that is to say, the underlying patterns and structures, or is it one emergent from notational limits? Is a rearrangement of terms in a divergent series actually fundamentally different to a rearrangement of terms in a convergent series, or is it the same thing that manifests different results in different contexts? I am not sure if this question can be answered sensibly, but for my money I am reminded of the old dichotomy in the philosophy of mathematics whose question is still yet to find an answer: Is mathematics discovered or invented? Now there is truly some food for thought.
Notes
$1$. There are other ways to use sigma notation. For example, equation \ref{eq:natural} can also be represented as $\sum_{n\geq 1}n$ ($n$ being a whole number is implied by the discrete sum being used instead of the continuous integral) or $\sum_{n\in\mathbb{N}}n$. Other less common examples of sigma notation are $\sum_{p\text{ prime}}\frac{1}{p}$, which is the sum of the reciprocals of all prime numbers, and $\sum_{d \vert n}d^x$, which is the divisor function, where '$d \vert n$' means $d$ divides $n$ exactly.
$2$. The sum shown here has the interesting property of being a series representation of the number 1, or in the given notation, $\sum_{n=1}^{\infty}\left(\frac{1}{2}\right)^n=1$. For any readers who are passingly familiar with Ancient Greek philosophy, this fact can be viewed as a solution to Zeno's dichotomy paradox, albeit one which ignores some nuances which I will address at a later date when I cover supertasks, one of the many intersections of philosophy and mathematics.
$3$. Formally, there exists a limit $S$ such that for any (arbitrarily small) number $\epsilon>0$ there is a number $N$ such that for $n>N$, $|S_n-S|<\epsilon$ where $S_n$ is the $n^{\text{th}}$ partial sum. Informally, there exists a number $S$ such that for an arbitrarily large $n$ the partial sum $S_n$ will be arbitrarily close to $S$.
$4$. The factorial operator is defined as $n!=n(n-1)(n-2)...1$, or in words, $n!$ (read '$n$ factorial') is given by the multiplication of $n$ by all the whole numbers less than $n$ going down to $1$. As an example, $5!=5\cdot4\cdot3\cdot2\cdot1=120$.
$5$. I will not delve here into the depths of conditional and absolute convergence, almost convergence, and so on. Suffice it to say that there are a great many interesting infinite series that have a great many interesting properties relating to convergence behaviour other than those simple ones shown here. If you are especially interested in the topic, I recommend investigating the Riemann series theorem for a very interesting and surprising property of conditionally convergent series.
$6$. The harmonic series is deceptively interesting and there are many, many different and varied ways of calculating the harmonic numbers. One example straight out of equation \ref{eq:Hn} is the recurrence relation (an equation which gives one term in a sequence in terms of a previous one) $H_n=H_{n-1}+1/n$. I encourage you to investigate others and see where it leads you!
$7$. The value of a field and the value of the derivative of the field at a given point in space cannot be known to arbitrary accuracy; the better one is known the less well the other must be. Though the uncertainty principle is often raised as a weird and wonderful result of quantum mechanics, in fact it is a feature of any wave theory and is linked intimately with Fourier transformations, although a thorough demonstration of how this is so is sadly beyond the scope of this note.
$8$. The complexities of quantum field theory should by no means be underestimated; what I am presenting here is an extraordinarily simplified version that, while instructive, would not necessarily be very useful in practice.
$9$. Standing waves are waves with nodes (points of zero displacement) at the endpoints. An example would be a plucked guitar string, which is restricted from moving at the bridge and nut. This restriction ensures the only possible wavelengths are given by $\lambda_n=2a/n$ for length $a$ where $n$ is a natural number. Using the wave relationship $c=\nu_n\lambda_n$ we find equivalently $\nu_n=nc/2a$ (or equation \ref{eq:omega} as $\omega=2\pi\nu$) which gives the frequency $\nu$ for the $n^{\text{th}}$ harmonic.
$10$. Or rather, one of the tools, as we could equally have chosen the Abel sum, the Borel sum, the $1/x$ series method, or a number of others. Cesàro summation is far from the only rigorous way of dealing with Grandi's series, but what is important is that it gives a value of 1/2, as do the methods I listed above—this consistency is a big hint that we have picked the right number to assign to the series.
$11$. It is worth noting that if we 'dilute' the series by adding in $+0$s we change the value of the Cesàro sum (although the summability is not affected). This illuminates yet another mathematical manipulation which would be fine for a convergent series but is not for a divergent series.
$12$. Precisely because it would become divergent.
$13$. If $e$ is my favourite number then $\zeta(s)$ is surely my favourite function, but in exactly the same way I could not possibly hope to explain why except in another post devoted to it exclusively.
$14$. For fear of overwhelming you with yet more beautiful mathematics in an already over-long post I will avoid the temptation of discussing the Bernoulli numbers, although as ever I encourage the interested reader to investigate for themselves!
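The dilution mentioned in note 11 can be checked directly. The sketch below compares Grandi's series with one particular diluted version, $1+0-1+1+0-1+...$ (the placement of the zeros is my own choice, as is the function name `cesaro_mean`), and computes the mean of the partial sums of each over 600 terms.

```python
from fractions import Fraction
from itertools import accumulate

def cesaro_mean(terms):
    """Return the mean of the partial sums of `terms`."""
    partial_sums = list(accumulate(terms))
    return Fraction(sum(partial_sums), len(partial_sums))

plain = [1, -1] * 300            # 1 - 1 + 1 - 1 + ...
diluted = [1, 0, -1] * 200       # 1 + 0 - 1 + 1 + 0 - 1 + ...

print(cesaro_mean(plain))        # 1/2
print(cesaro_mean(diluted))      # 2/3
```

The inserted zeros make the partial sums linger at 1 for two steps out of every three, which drags the average up from $1/2$ to $2/3$.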