
Wigner's theorem

Wigner's theorem is one of the most fundamental results in quantum theory, but I somehow didn't hear of it for the first time until the third year of my PhD. Even then, it took me another year or so to fully appreciate the theorem's importance. I suspect this experience is common — Wigner's theorem is thought of as being fairly technical or mathematical, and doesn't get covered in most quantum mechanics courses. But because it's so essential, I'd like to dedicate a post to explaining and proving it.

The statement of the theorem is simple: every symmetry of a quantum system can be represented as a unitary or anti-unitary operator on Hilbert space. (Here we are implicitly thinking about quantum states as vectors in a Hilbert space — there are ways of thinking about Wigner's theorem from a more operator-algebraic point of view, but the Hilbert space picture is a good place to start.) The reason Wigner's theorem is so valuable is that if we believe a system has a symmetry — for example, rotational invariance — then we know there must be corresponding unitary or anti-unitary operators in the theory. Our task, in understanding how symmetries act on a given quantum system, is then to find those operators explicitly.

Two preliminaries are needed to even explain the statement of the theorem: (1) What is the mathematical definition of a quantum symmetry? (2) What is an anti-unitary operator? In the first two sections, I answer these questions, and also introduce how to think about the state space of quantum mechanics as a projective space. In the final section, I provide a simple proof of Wigner's theorem following Géher. (Unlike Géher, however, I will be happy to assume that Hilbert space is separable and that symmetries are bijective. More on that later.)

Prerequisites: Only elementary quantum mechanics.

Table of Contents

  1. Symmetries and Projective Space
  2. Unitary and Anti-unitary Operators
  3. Wigner's Theorem

Symmetries and Projective Space

What is a (pure) quantum state? As a preliminary answer, we are usually told that a quantum state is a unit-norm vector $|\psi\rangle$ in a Hilbert space $\mathcal{H}$. This isn't quite right, though — the only quantities in a quantum system accessible to experiment are transition probabilities
$$T(|\psi\rangle, |\phi\rangle) = |\langle \psi | \phi \rangle|^2.$$
These amplitudes are invariant under the transformation $|\psi\rangle \mapsto e^{i\alpha}|\psi\rangle$. This is the source of the familiar statement that "pure states are defined only up to a phase." Conceptually, since we must allow this ambiguity in the definition of a quantum state, it will be convenient to also allow pure states to be unnormalized. To deal with this, we update our definition of transition probabilities to:
$$T(|\psi\rangle, |\phi\rangle) \equiv \frac{|\langle \psi | \phi \rangle|^2}{\langle \psi | \psi \rangle \langle \phi | \phi \rangle}.$$
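To make the invariance concrete, here is a quick numerical check, written as a minimal sketch in Python with NumPy (the vectors and the scalar $\lambda$ are arbitrary choices of mine, not anything from the text): the normalized transition probability is unchanged when either argument is rescaled by a nonzero complex number.

```python
import numpy as np

def transition_probability(psi, phi):
    """T(psi, phi) = |<psi|phi>|^2 / (<psi|psi> <phi|phi>) for unnormalized vectors."""
    overlap = np.vdot(psi, phi)  # np.vdot conjugates its first argument, i.e. <psi|phi>
    return abs(overlap) ** 2 / (np.vdot(psi, psi).real * np.vdot(phi, phi).real)

rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)

lam = 2.7 * np.exp(0.4j)  # any nonzero complex scalar
assert np.isclose(transition_probability(psi, phi),
                  transition_probability(lam * psi, phi))
assert np.isclose(transition_probability(psi, phi),
                  transition_probability(psi, lam * phi))
```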
 
With this convention, any two nonzero vectors in $\mathcal{H}$ that are related by multiplication by a nonzero complex scalar are physically equivalent. So a pure quantum state should really be thought of as a one-dimensional subspace of $\mathcal{H}$ — or, more precisely, a one-dimensional subspace where the zero vector has been removed. With $\mathbb{C}^{\times}$ denoting the space $\mathbb{C} \setminus \{0\}$, a quantum state is best thought of as the set of physically equivalent vectors $\{\lambda |\psi\rangle \,|\, \lambda \in \mathbb{C}^{\times}\}$. Formally, we define an equivalence relation between nonzero vectors in $\mathcal{H}$:
$$|\psi\rangle \sim |\phi\rangle \iff |\psi\rangle = \lambda |\phi\rangle, \quad \lambda \neq 0,$$
and define quantum states to be equivalence classes of vectors under this relation:
$$[|\psi\rangle] = \{\lambda |\psi\rangle \,|\, \lambda \in \mathbb{C}^{\times}\}.$$

The set of all such equivalence classes in a vector space has a name: projective space. We denote it $P(\mathcal{H})$, and define it formally by
$$P(\mathcal{H}) = \{ [|\psi\rangle] \,\big|\, |\psi\rangle \in \mathcal{H} \setminus \{0\} \}.$$
This is equivalently realized as the quotient space
$$P(\mathcal{H}) = (\mathcal{H} \setminus \{0\}) / \sim,$$
where $\sim$ is the equivalence relation defined above. One advantage of thinking of projective space as a quotient space is that it equips $P(\mathcal{H})$ with a topology — $\mathcal{H}$ has a topology induced by the norm associated with the inner product, and $P(\mathcal{H})$ has the corresponding quotient topology. (I.e., a set of equivalence classes in $P(\mathcal{H})$ is open if and only if the set of all corresponding elements in $\mathcal{H}$ is open.)
 
So, what is a symmetry? Physically, a symmetry is something we can do to a physical system that does not change the physics. Because it's something we are "doing to the system," it ought to be a map on the space of states. Let's call this map $S : P(\mathcal{H}) \to P(\mathcal{H})$. Because it "doesn't change the physics," all the transition amplitudes should be unchanged:
$$T([|\psi\rangle], [|\phi\rangle]) = T(S[|\psi\rangle], S[|\phi\rangle]).$$
It's very important to remember that $S$ isn't a map on the vector space $\mathcal{H}$, but on the projective space $P(\mathcal{H})$. Projective space has no vector space structure, so we shouldn't even expect $S$ to be linear. The beauty of Wigner's theorem is that there will always exist some unitary (or anti-unitary) map $\hat{S} : \mathcal{H} \to \mathcal{H}$ that agrees with the action of $S$ on $P(\mathcal{H})$.

Unitary and Anti-unitary Operators

A unitary operator is, by definition, a linear map $U : \mathcal{H} \to \mathcal{H}$ that leaves the inner product unchanged:
$$\langle U\psi | U\phi \rangle = \langle \psi | \phi \rangle.$$
In proving Wigner's theorem, however, we will see that the map $\hat{S} : \mathcal{H} \to \mathcal{H}$ is not guaranteed to be linear. In fact, one of the most fundamental symmetries of nature, the CPT symmetry of four-dimensional quantum field theory, is not realizable as a linear map! (It is, I believe, linear in Euclidean signature, but the time-reversal in CPT becomes antilinear upon Wick rotating to Minkowski space. This is because in Euclidean signature, CPT is continuously connected to the identity — see section 5.1 of Witten's notes on entanglement in quantum field theory.)

So we need a modest generalization of unitarity to allow for antilinear maps. An antilinear map on $\mathcal{H}$ is an $\mathbb{R}$-linear map that acts by conjugation on the imaginary unit $i$. I.e., if $M$ is antilinear then for any real numbers $a$ and $b$ we have
$$M\big((a + ib)|\psi\rangle\big) = (a - ib)\, M(|\psi\rangle).$$
While the word "operator" is often used to refer to linear maps on vector spaces, we will use it to refer to antilinear maps as well. (Often saying, explicitly, "linear operator" or "antilinear operator.")

What does it mean for an antilinear map to "leave the inner product unchanged"? As a preliminary definition, we might guess that an antilinear map $V : \mathcal{H} \to \mathcal{H}$ should be called anti-unitary if it satisfies
$$\langle V\psi | V\phi \rangle = \langle \psi | \phi \rangle.$$
But no such map can exist. The reason is that the left-hand side of this equation is antilinear in $|\phi\rangle$, while the right-hand side is linear in $|\phi\rangle$. If we replaced $|\phi\rangle$ by $i|\phi\rangle$, we would have an inconsistency. To rectify this, we add an extra complex conjugation, and say that the antilinear map $V$ is anti-unitary if it satisfies
$$\langle V\psi | V\phi \rangle = \overline{\langle \psi | \phi \rangle} = \langle \phi | \psi \rangle.$$
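As a sanity check on these two definitions, here is a short numerical sketch (Python/NumPy; the random unitary and test vectors are arbitrary). A unitary preserves the inner product, while an anti-unitary, modeled here as complex conjugation in a fixed basis followed by a unitary, maps the inner product to its complex conjugate.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4

# A random unitary from the QR decomposition of a complex Gaussian matrix.
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
U, _ = np.linalg.qr(A)

psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
phi = rng.normal(size=dim) + 1j * rng.normal(size=dim)

# Unitary: <U psi | U phi> = <psi | phi>.
assert np.isclose(np.vdot(U @ psi, U @ phi), np.vdot(psi, phi))

# Anti-unitary: conjugate in the chosen basis, then apply the unitary.
V = lambda v: U @ np.conj(v)
# <V psi | V phi> equals the complex conjugate of <psi | phi>.
assert np.isclose(np.vdot(V(psi), V(phi)), np.conj(np.vdot(psi, phi)))
```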

Wigner's Theorem

We will now make a few assumptions that will simplify the proof of Wigner's theorem. The first is that the Hilbert space $\mathcal{H}$ is separable. This means that there exists a countable, orthonormal basis $\{|e_j\rangle\}_{j=1}^{\infty}$. This is a standard assumption in quantum theory. We will also assume that the symmetry transform $S$ is bijective — i.e., that it can be inverted on $P(\mathcal{H})$. This is the case for elements of symmetry groups, which are of most fundamental interest. Put simply: if your symmetry is rotating a system by $90^{\circ}$, you can equally well rotate it back.

As a matter of notation, maps acting on $\mathcal{H}$ will henceforth be denoted with hats — for example $\hat{M}$ — and the induced maps on $P(\mathcal{H})$ will be denoted without hats. So if $\hat{M}$ is an operator mapping $\mathcal{H}$ to itself, then $M : P(\mathcal{H}) \to P(\mathcal{H})$ is defined by
$$M([|\psi\rangle]) = [\hat{M}|\psi\rangle].$$
The map $M$ is well defined on $P(\mathcal{H})$ for both linear and antilinear operators $\hat{M}$. This "hat notation" is consistent with the statement of the theorem: given a symmetry transformation $S$ on $P(\mathcal{H})$, we seek a linear or antilinear operator $\hat{S} : \mathcal{H} \to \mathcal{H}$ that descends to $S$ under the quotient.
 
The basic idea of the proof is very simple. For any unitary or anti-unitary operator $\hat{U}$ on $\mathcal{H}$, the corresponding map $U$ on $P(\mathcal{H})$ is a symmetry transform. Because symmetry transforms form a group, $US$ is also a symmetry transform. The proof of Wigner's theorem works by an iterative procedure of simplifying the symmetry transform $S$ by composing it with well-chosen unitary or anti-unitary maps. We will find a unitary operator $\hat{U}$ so that $US$ keeps a given orthonormal basis fixed, then find a unitary or anti-unitary operator $\hat{V}$ so that $V^{-1}US$ acts as the identity on $P(\mathcal{H})$. Once we have done this, we will know that the action of $U^{-1}V$ equals $S$ on $P(\mathcal{H})$, which implies that $\hat{S} \equiv \hat{U}^{-1}\hat{V}$ implements $S$ on $\mathcal{H}$. Because $\hat{U}$ is unitary and $\hat{V}$ is unitary or anti-unitary, the composition $\hat{U}^{-1}\hat{V}$ is unitary or anti-unitary and the proof is complete.

To start, we fix a countable orthonormal basis $\{|e_j\rangle\}$, and ask, "what does $S$ do to $[|e_j\rangle]$?" We know it maps to some equivalence class of states in $P(\mathcal{H})$; let $|\chi_j\rangle$ be an arbitrary unit vector in that equivalence class, so that we have
$$S([|e_j\rangle]) = [|\chi_j\rangle].$$
Since $S$ is a symmetry transform, we have
$$|\langle \chi_j | \chi_k \rangle|^2 = |\langle e_j | e_k \rangle|^2 = \delta_{jk},$$
so the set $\{|\chi_j\rangle\}$ is orthonormal. Because $S$ is invertible, we also know that $\{|\chi_j\rangle\}$ forms a basis — for any $|\psi\rangle \in \mathcal{H}$, we can pick a unit vector $|\phi\rangle$ in $S^{-1}([|\psi\rangle])$, and write
$$\langle \phi | \phi \rangle = \sum_j |\langle \phi | e_j \rangle|^2 = \langle \phi | \phi \rangle \sum_j T(|\phi\rangle, |e_j\rangle).$$
Applying the symmetry transform, and using $S([|\phi\rangle]) = [|\psi\rangle]$, we obtain
$$1 = \sum_j T(|\psi\rangle, |\chi_j\rangle) = \frac{\langle \psi | \left( \sum_j |\chi_j\rangle\langle\chi_j| \right) | \psi \rangle}{\langle \psi | \psi \rangle}.$$
This equality can only hold for all vectors $|\psi\rangle \in \mathcal{H}$ if $\sum_j |\chi_j\rangle\langle\chi_j|$ acts as the identity on $\mathcal{H}$, i.e., if $\{|\chi_j\rangle\}$ is a basis.
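The orthonormality-plus-completeness claim is easy to test numerically. In the sketch below (Python/NumPy, a toy setup of my own), the symmetry is secretly induced by a random unitary $W$, which is used only to produce representatives $|\chi_j\rangle$ of the classes $S([|e_j\rangle])$ with arbitrary phases; the checks themselves only use the $|\chi_j\rangle$.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 5

# A hidden unitary standing in for the symmetry; we only use it to build the chi_j.
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
W, _ = np.linalg.qr(A)

phases = np.exp(1j * rng.uniform(0, 2 * np.pi, size=dim))
chi = [phases[j] * W[:, j] for j in range(dim)]  # chi_j: some unit vector in S([e_j])

# Orthonormality: |<chi_j|chi_k>|^2 = delta_jk.
gram = np.array([[abs(np.vdot(cj, ck)) ** 2 for ck in chi] for cj in chi])
assert np.allclose(gram, np.eye(dim))

# Completeness: sum_j |chi_j><chi_j| acts as the identity, so {chi_j} is a basis.
projector_sum = sum(np.outer(c, np.conj(c)) for c in chi)
assert np.allclose(projector_sum, np.eye(dim))
```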

Now, let $\hat{U}$ be the unitary operator on $\mathcal{H}$ that acts as $\hat{U}|\chi_j\rangle = |e_j\rangle$. The map $US$ on $P(\mathcal{H})$ is a symmetry transform that acts as the identity on the set $\{[|e_j\rangle]\}$. It is not, however, necessarily the identity transform; while $US$ acts as the identity on $[|e_j\rangle]$, it does not necessarily act as the identity on general states $[\sum_j c_j |e_j\rangle]$. So we seek a unitary or anti-unitary operator $\hat{V}$ so that $V^{-1}US$ fixes all equivalence classes of linear combinations of $\{|e_j\rangle\}$ vectors. We will start with two simple sets of equivalence classes:
$$[|\alpha_j\rangle] = \left[ \frac{|e_j\rangle + |e_{j+1}\rangle}{\sqrt{2}} \right]$$
and
$$[|\beta_j\rangle] = \left[ \frac{|e_j\rangle - i\, |e_{j+1}\rangle}{\sqrt{2}} \right].$$
The transition amplitudes of these equivalence classes with each other and with $[|e_j\rangle]$ are given by:
$$T([|e_k\rangle], [|\alpha_j\rangle]) = T([|e_k\rangle], [|\beta_j\rangle]) = \frac{\delta_{k,j} + \delta_{k,j+1}}{2}, \tag{1}$$
$$T([|\alpha_k\rangle], [|\beta_j\rangle]) = \frac{2\delta_{k,j} + \delta_{k,j+1} + \delta_{k,j-1}}{4}, \tag{2}$$
$$T([|\alpha_k\rangle], [|\alpha_j\rangle]) = T([|\beta_k\rangle], [|\beta_j\rangle]) = \frac{4\delta_{k,j} + \delta_{k,j+1} + \delta_{k,j-1}}{4}. \tag{3}$$
Since $US$ is a symmetry transform, it must keep all of these amplitudes fixed. This is quite restrictive! In particular, if $|\alpha'_j\rangle$ is a unit vector in the equivalence class $(US)([|\alpha_j\rangle])$, then because $(US)$ fixes $[|e_j\rangle]$, equation (1) implies that $|\alpha'_j\rangle$ can be decomposed as
$$|\alpha'_j\rangle = \frac{z|e_j\rangle + w|e_{j+1}\rangle}{\sqrt{2}}$$
with $|z| = |w| = 1$. Since the choice of unit vector $|\alpha'_j\rangle$ within the class is arbitrary up to a phase, we might as well choose the global phase so that we can write $|\alpha'_j\rangle$ as
$$|\alpha'_j\rangle = \frac{|e_j\rangle + e^{i\omega_j}|e_{j+1}\rangle}{\sqrt{2}}.$$
Similarly, one can show there exists a unit vector $|\beta'_j\rangle$ in the equivalence class $(US)([|\beta_j\rangle])$ with decomposition
$$|\beta'_j\rangle = \frac{|e_j\rangle + e^{i\zeta_j}|e_{j+1}\rangle}{\sqrt{2}}.$$

The transition amplitudes of these states with one another are
$$T([|\alpha'_k\rangle], [|\alpha'_j\rangle]) = T([|\beta'_k\rangle], [|\beta'_j\rangle]) = \frac{4\delta_{k,j} + \delta_{k,j+1} + \delta_{k,j-1}}{4}, \tag{4}$$
$$T([|\alpha'_k\rangle], [|\beta'_j\rangle]) = \frac{2\big(1 + \cos(\zeta_j - \omega_k)\big)\delta_{k,j} + \delta_{k,j+1} + \delta_{k,j-1}}{4}. \tag{5}$$
Equation (4) is already consistent with equation (3); but in order for equation (5) to be consistent with equation (2), we must have $\zeta_j = \omega_j \pm \pi/2 \bmod 2\pi$.
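The constraint $\zeta_j = \omega_j \pm \pi/2$ is simple enough to verify directly. The snippet below (Python/NumPy; the value of $\omega$ is an arbitrary choice) works in the two-dimensional span of $|e_j\rangle, |e_{j+1}\rangle$ and checks that $T([|\alpha'_j\rangle], [|\beta'_j\rangle]) = (1+\cos(\zeta_j - \omega_j))/2$ reproduces the required value $1/2$ exactly when the phases differ by $\pi/2$.

```python
import numpy as np

def T(psi, phi):
    """Normalized transition probability for (possibly unnormalized) vectors."""
    return abs(np.vdot(psi, phi)) ** 2 / (np.vdot(psi, psi).real * np.vdot(phi, phi).real)

e = np.eye(2)                # the span of |e_j>, |e_{j+1}>
omega = 0.7                  # arbitrary phase omega_j
zeta = omega + np.pi / 2     # the constraint zeta_j = omega_j + pi/2

alpha = (e[0] + e[1]) / np.sqrt(2)                         # |alpha_j>
beta = (e[0] - 1j * e[1]) / np.sqrt(2)                     # |beta_j>
alpha_p = (e[0] + np.exp(1j * omega) * e[1]) / np.sqrt(2)  # |alpha'_j>
beta_p = (e[0] + np.exp(1j * zeta) * e[1]) / np.sqrt(2)    # |beta'_j>

# T([alpha'],[beta']) = (1 + cos(zeta - omega)) / 2 must match T([alpha],[beta]) = 1/2,
# which happens precisely when cos(zeta - omega) = 0.
assert np.isclose(T(alpha, beta), 0.5)
assert np.isclose(T(alpha_p, beta_p), (1 + np.cos(zeta - omega)) / 2)
assert np.isclose(T(alpha_p, beta_p), 0.5)
```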
 
A priori, it looks like the signs in these $\pm\pi/2$ constants are independent for the different values of $j$; for example, we could have $\zeta_1 = \omega_1 + \pi/2$ and $\zeta_2 = \omega_2 - \pi/2$. This is actually not the case! Suppose, toward contradiction, that for some fixed $j$ we have $\zeta_j = \omega_j + \pi/2$ and $\zeta_{j+1} = \omega_{j+1} - \pi/2$. (For completeness we would need to also check the case $\zeta_j = \omega_j - \pi/2$ and $\zeta_{j+1} = \omega_{j+1} + \pi/2$, but the argument is the same.) Fix then a vector
$$|v\rangle = a|e_j\rangle + b|e_{j+1}\rangle + c|e_{j+2}\rangle$$
with $|a|^2 + |b|^2 + |c|^2 = 1$. Fix also a unit vector $|v'\rangle$ in $(US)([|v\rangle])$. The transition amplitudes of $|v\rangle$ with $|e_k\rangle$, $|\alpha_k\rangle$, and $|\beta_k\rangle$ are
$$T([|v\rangle], [|e_k\rangle]) = |a|^2\delta_{k,j} + |b|^2\delta_{k,j+1} + |c|^2\delta_{k,j+2},$$
$$T([|v\rangle], [|\alpha_k\rangle]) = \frac{|a+b|^2\delta_{k,j} + |b+c|^2\delta_{k,j+1} + |c|^2\delta_{k,j+2} + |a|^2\delta_{k,j-1}}{2}, \tag{6}$$
$$T([|v\rangle], [|\beta_k\rangle]) = \frac{|a+ib|^2\delta_{k,j} + |b+ic|^2\delta_{k,j+1} + |c|^2\delta_{k,j+2} + |a|^2\delta_{k,j-1}}{2}. \tag{7}$$
The first of these equations implies that, after redefining $|v'\rangle$ by a global phase, we may write
$$|v'\rangle = a e^{i\rho}|e_j\rangle + b|e_{j+1}\rangle + c e^{i\tau}|e_{j+2}\rangle.$$
The transition amplitudes of this state with $|\alpha'_k\rangle$ and $|\beta'_k\rangle$ are
$$T([|v'\rangle], [|\alpha'_k\rangle]) = \frac{|a + b e^{-i(\rho+\omega_k)}|^2\delta_{k,j} + |b + c e^{i(\tau-\omega_k)}|^2\delta_{k,j+1} + |c|^2\delta_{k,j+2} + |a|^2\delta_{k,j-1}}{2}, \tag{8}$$
$$T([|v'\rangle], [|\beta'_k\rangle]) = \frac{|a + b e^{-i(\rho+\zeta_k)}|^2\delta_{k,j} + |b + c e^{i(\tau-\zeta_k)}|^2\delta_{k,j+1} + |c|^2\delta_{k,j+2} + |a|^2\delta_{k,j-1}}{2}. \tag{9}$$
Consistency of these equations with (6) and (7) imposes $|a+b| = |a + b e^{-i(\rho+\omega_j)}|$, $|b+c| = |b + c e^{i(\tau-\omega_{j+1})}|$, $|a+ib| = |a + b e^{-i(\rho+\zeta_j)}|$, and $|b+ic| = |b + c e^{i(\tau-\zeta_{j+1})}|$. Putting in our assumption $\zeta_j = \omega_j + \pi/2$ and $\zeta_{j+1} = \omega_{j+1} - \pi/2$, we obtain the equations
$$|a+b| = |a + b e^{-i(\rho+\omega_j)}|,$$
$$|a+ib| = |a - i b\, e^{-i(\rho+\omega_j)}|,$$
$$|b+c| = |b + c e^{i(\tau-\omega_{j+1})}|,$$
$$|b+ic| = |b + i c\, e^{i(\tau-\omega_{j+1})}|.$$
With a bit of manipulation, one can show these equations imply $\tau = \omega_{j+1}$ and $\bar{a}b = a\bar{b}\, e^{i(\rho+\omega_j)}$. With a different set of constants
$$|\tilde{v}\rangle = \tilde{a}|e_j\rangle + \tilde{b}|e_{j+1}\rangle + \tilde{c}|e_{j+2}\rangle$$
and a vector $|\tilde{v}'\rangle$ in $(US)([|\tilde{v}\rangle])$ given by
$$|\tilde{v}'\rangle = \tilde{a} e^{i\tilde{\rho}}|e_j\rangle + \tilde{b}|e_{j+1}\rangle + \tilde{c} e^{i\tilde{\tau}}|e_{j+2}\rangle,$$
the condition $|\langle v | \tilde{v} \rangle|^2 = |\langle v' | \tilde{v}' \rangle|^2$ gives
$$\bar{a}\tilde{a} = \bar{a}\tilde{a}\, e^{i(\tilde{\rho} - \rho)},$$
where we have used $\tau = \tilde{\tau} = \omega_{j+1}$. But this equation is inconsistent with $\bar{a}b = a\bar{b}\, e^{i(\rho+\omega_j)}$ and $\bar{\tilde{a}}\tilde{b} = \tilde{a}\bar{\tilde{b}}\, e^{i(\tilde{\rho}+\omega_j)}$. For example, if we choose $a$ to be nonzero and real, $b$ to be nonzero and purely imaginary, and $\tilde{a}$ and $\tilde{b}$ to be nonzero and real, then we obtain
$$1 = e^{i(\tilde{\rho} - \rho)},$$
$$-1 = e^{i(\rho + \omega_j)},$$
and
$$1 = e^{i(\tilde{\rho} + \omega_j)}.$$
But these expressions are inconsistent; in order for them to be true, we would need $\tilde{\rho} = \rho$, $\rho = \pi - \omega_j$, and $\tilde{\rho} = -\omega_j$ to all be simultaneously true mod $2\pi$; this is impossible. So, at the end of this fairly lengthy paragraph, we find that the sign in the expression $\zeta_j = \omega_j \pm \pi/2$ is $j$-independent.
 
To recap, we have shown now that for any symmetry transform $S$, there is a unitary operator $\hat{U}$ on $\mathcal{H}$ such that $(US)$ satisfies
$$(US)([|e_j\rangle]) = [|e_j\rangle], \tag{10}$$
$$(US)\left( \left[ \frac{|e_j\rangle + |e_{j+1}\rangle}{\sqrt{2}} \right] \right) = \left[ \frac{|e_j\rangle + e^{i\omega_j}|e_{j+1}\rangle}{\sqrt{2}} \right], \tag{11}$$
and
$$(US)\left( \left[ \frac{|e_j\rangle - i|e_{j+1}\rangle}{\sqrt{2}} \right] \right) = \left[ \frac{|e_j\rangle \pm i\, e^{i\omega_j}|e_{j+1}\rangle}{\sqrt{2}} \right], \tag{12}$$
where the choice of plus or minus sign isn't up to us — it depends on the particular symmetry transform $S$ — but is $j$-independent.
 
Now, let $\hat{V}$ be a map on $\mathcal{H}$ that maps $|e_j\rangle$ to $\exp\left[ i \sum_{k=1}^{j-1} \omega_k \right] |e_j\rangle$, with the empty sum for $j=1$ understood to vanish. If equation (12) holds with the minus sign, then we extend $\hat{V}$ to all of $\mathcal{H}$ linearly and it becomes a unitary operator. If equation (12) holds with the plus sign, then we extend $\hat{V}$ to $\mathcal{H}$ antilinearly and it becomes an anti-unitary operator. In either case, we have
$$V([|e_j\rangle]) = \left[ \exp\left[ i \sum_{k=1}^{j-1} \omega_k \right] |e_j\rangle \right] = [|e_j\rangle],$$
$$V\left( \left[ \frac{|e_j\rangle + |e_{j+1}\rangle}{\sqrt{2}} \right] \right) = \left[ \exp\left[ i \sum_{k=1}^{j-1} \omega_k \right] \frac{|e_j\rangle + e^{i\omega_j}|e_{j+1}\rangle}{\sqrt{2}} \right] = \left[ \frac{|e_j\rangle + e^{i\omega_j}|e_{j+1}\rangle}{\sqrt{2}} \right],$$
$$V\left( \left[ \frac{|e_j\rangle - i|e_{j+1}\rangle}{\sqrt{2}} \right] \right) = \left[ \exp\left[ i \sum_{k=1}^{j-1} \omega_k \right] \frac{|e_j\rangle \mp i\, e^{i\omega_j}|e_{j+1}\rangle}{\sqrt{2}} \right] = \left[ \frac{|e_j\rangle \mp i\, e^{i\omega_j}|e_{j+1}\rangle}{\sqrt{2}} \right],$$
where in this last equation the minus sign arises when $\hat{V}$ is linear, and the plus sign arises when $\hat{V}$ is antilinear. Combining these equations with (10), (11), and (12), we see that the symmetry transform $V^{-1}US$ fixes the equivalence classes $[|e_j\rangle]$, $[(|e_j\rangle + |e_{j+1}\rangle)/\sqrt{2}]$, and $[(|e_j\rangle - i|e_{j+1}\rangle)/\sqrt{2}]$ in $P(\mathcal{H})$. This turns out to be enough to imply that it acts as the identity on $P(\mathcal{H})$!
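Here is a small numerical sketch of the unitary (linear) branch of this step, written in Python/NumPy with an arbitrary dimension and arbitrary phases $\omega_j$ of my own choosing. The helper `same_ray` tests equality of equivalence classes by checking that the Cauchy-Schwarz inequality is saturated.

```python
import numpy as np

def same_ray(u, v):
    """True iff u and v define the same point of P(H), i.e. are proportional."""
    return np.isclose(abs(np.vdot(u, v)) ** 2,
                      np.vdot(u, u).real * np.vdot(v, v).real)

rng = np.random.default_rng(3)
dim = 6
omega = rng.uniform(0, 2 * np.pi, size=dim - 1)  # omega_1, ..., omega_{dim-1}

# V_hat |e_j> = exp(i sum_{k<j} omega_k) |e_j>, extended linearly (the unitary branch).
theta = np.concatenate(([0.0], np.cumsum(omega)))
Vhat = np.diag(np.exp(1j * theta))

e = np.eye(dim)
for j in range(dim - 1):
    alpha = (e[j] + e[j + 1]) / np.sqrt(2)
    beta = (e[j] - 1j * e[j + 1]) / np.sqrt(2)
    # V fixes [e_j] and sends [alpha_j], [beta_j] to the classes on the right-hand
    # sides of (11) and (12) with the minus sign, as claimed for the linear extension.
    assert same_ray(Vhat @ e[j], e[j])
    assert same_ray(Vhat @ alpha, (e[j] + np.exp(1j * omega[j]) * e[j + 1]) / np.sqrt(2))
    assert same_ray(Vhat @ beta, (e[j] - 1j * np.exp(1j * omega[j]) * e[j + 1]) / np.sqrt(2))
```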

To see why fixing these classes is enough, let $|\psi\rangle = \sum_j c_j |e_j\rangle$ be an arbitrary unit vector in $\mathcal{H}$, and let $|\psi'\rangle$ be a unit vector in the equivalence class $(V^{-1}US)([|\psi\rangle])$. As in all of our previous examples, the fact that $V^{-1}US$ preserves the classes $[|e_j\rangle]$ implies $|\langle e_j | \psi \rangle|^2 = |\langle e_j | \psi' \rangle|^2$, which implies
$$|\psi'\rangle = \sum_j c_j e^{i\phi_j} |e_j\rangle.$$
The preservation of the transition amplitudes of $|\psi\rangle$ with $(|e_j\rangle + |e_{j+1}\rangle)/\sqrt{2}$ implies
$$|c_j + c_{j+1}| = |c_j + c_{j+1} e^{i(\phi_{j+1} - \phi_j)}|,$$
and preservation of transition amplitudes of $|\psi\rangle$ with $(|e_j\rangle - i|e_{j+1}\rangle)/\sqrt{2}$ implies
$$|c_j + i c_{j+1}| = |c_j + i c_{j+1} e^{i(\phi_{j+1} - \phi_j)}|.$$
Elementary manipulations of these equations yield the identity
$$\bar{c}_j c_{j+1} \left( 1 - e^{i(\phi_{j+1} - \phi_j)} \right) = 0.$$
When all of the $c_j$ coefficients are nonzero, this implies that all of the phases $\phi_j$ are equal, which gives $[|\psi'\rangle] = [|\psi\rangle]$, and thus that $(V^{-1}US)$ acts as the identity on $[|\psi\rangle]$. But the set of vectors $|\psi\rangle$ where all of the coefficients $c_j$ are nonzero is dense in $\mathcal{H}$; by taking limits appropriately, this implies that $(V^{-1}US)$ acts as the identity on all of $P(\mathcal{H})$.

We're done! Having shown that there exists a unitary operator $\hat{U}$ and a unitary or anti-unitary operator $\hat{V}$ so that $(V^{-1}US)$ acts as the identity on $P(\mathcal{H})$, we have shown that $\hat{U}^{-1}\hat{V}$ is a unitary or anti-unitary operator on $\mathcal{H}$ whose action is consistent with the action of $S$ on $P(\mathcal{H})$.
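Finally, here is an end-to-end numerical sketch of the whole construction (Python/NumPy). Everything in it is a toy assumption of mine: the symmetry $S$ is secretly implemented by a hidden unitary or anti-unitary operator built from a random matrix $W$, and is exposed only as a black box that returns an arbitrary-phase representative of $S([|\psi\rangle])$. The code then follows the steps of the proof (build $\hat{U}$ from the $\chi_j$, read off the phases $\omega_j$, detect the sign in (12) to choose between the unitary and anti-unitary branches, build $\hat{V}$) and checks that $\hat{U}^{-1}\hat{V}$ agrees with $S$ up to phase on random states.

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 5
e = np.eye(dim)

A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
W, _ = np.linalg.qr(A)        # hidden unitary
ANTIUNITARY = True            # flip to False to test the unitary case

def S(psi):
    """The symmetry as a black box on rays: returns some representative of S([psi])."""
    image = W @ (np.conj(psi) if ANTIUNITARY else psi)
    return np.exp(1j * rng.uniform(0, 2 * np.pi)) * image

# Step 1: chi_j is a representative of S([e_j]); U_hat is the unitary with U_hat chi_j = e_j.
chi = np.column_stack([S(e[j]) for j in range(dim)])
Uhat = chi.conj().T

def US(psi):
    return Uhat @ S(psi)

# Step 2: read off the phases omega_j from (US)([e_j + e_{j+1}]).
omega = np.zeros(dim - 1)
for j in range(dim - 1):
    a = US((e[j] + e[j + 1]) / np.sqrt(2))
    omega[j] = np.angle(a[j + 1] / a[j])

# Step 3: the sign in (12), read off from (US)([e_1 - i e_2]); plus means anti-unitary.
b = US((e[0] - 1j * e[1]) / np.sqrt(2))
anti_branch = np.isclose(np.exp(1j * np.angle(b[1] / b[0])),
                         np.exp(1j * (omega[0] + np.pi / 2)))

# Step 4: V_hat is diagonal with phases sum_{k<j} omega_k, extended (anti)linearly.
theta = np.concatenate(([0.0], np.cumsum(omega)))
Vhat = np.diag(np.exp(1j * theta))

def Shat(psi):
    """Candidate unitary or anti-unitary operator implementing S."""
    return Uhat.conj().T @ (Vhat @ (np.conj(psi) if anti_branch else psi))

# Check: Shat(psi) and S(psi) define the same ray, for random states psi.
for _ in range(10):
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    x, y = Shat(psi), S(psi)
    assert np.isclose(abs(np.vdot(x, y)) ** 2, np.vdot(x, x).real * np.vdot(y, y).real)

print("anti-unitary branch detected" if anti_branch else "unitary branch detected")
```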
