
Canonical (anti)commutation relations from the path integral

I like the path integral. The question of whether the path integral or canonical quantization is "more fundamental" is purely ideological, and impossible to answer. On the one hand, canonical quantization is rigorously defined in a much wider setting than the path integral; on the other hand, canonical quantization introduces "order ambiguities" in quantizing certain operators, and requires renormalization to define operators like $\phi(0)^2,$ while both of these issues are automatically avoided in the path integral. These advantages make me, personally, a path integral devotee.

One big advantage of canonical quantization over the path integral, however, is that it automatically gives you the canonical (anti)commutation relations between field operators and their conjugate momenta; in fact, this is basically the definition of canonical quantization: it gives a prescription for defining canonical (anti)commutators in terms of a bracket on the phase space of the classical theory.

If you want to study systems primarily using the path integral, however, then you should know how to derive canonical (anti)commutation relations from the path integral perspective. This short note explains how that calculation works for quantum mechanical systems with standard kinetic terms. Similar considerations apply to quantum field theories and to systems with nonstandard kinetic terms, but I think the quantum mechanical examples suffice to explain the basic idea. The calculation has the flavor of a Ward identity calculation for the transformation where all fields are shifted by a constant, $\phi(x) \mapsto \phi(x) + c$, except that this transformation is generally not a symmetry. We'll derive everything from scratch, though, without actually referencing Ward identities.

In section 1, I will derive canonical commutation relations for a single bosonic degree of freedom using the path integral.

In section 2, I will derive canonical anticommutation relations for $N$ fermionic degrees of freedom using the path integral.

Prerequisites: Path integral formulation of quantum mechanics. Section 2 requires some knowledge of Grassmann variables, but not much. Several times in this post I use the term "fields" to refer to the position function $x(t)$ in bosonic quantum mechanics, or the Grassmann-valued function $\psi(t)$ in fermionic quantum mechanics. This is because I tend to think of quantum mechanics as a one-dimensional QFT.

Table of Contents

  1. Canonical commutation relations
  2. Canonical anticommutation relations

1. Canonical commutation relations

Consider a single bosonic particle with the usual kinetic term. The classical theory is a theory of paths $x(t)$ with action
$$S[x] = \int dt\, \left( \frac{1}{2} m \dot{x}^2 - V(x) \right).$$
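The first-order variation of this action under a compactly supported perturbation $x(t) \mapsto x(t) + \epsilon \delta x(t)$, defined by $S[x + \epsilon \delta x] = S[x] + \epsilon \delta S + O(\epsilon^2)$, is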
$$\delta S = \int dt\, \left( - m \ddot{x} - V'(x) \right) \delta x, \tag{1}$$
which just gives us the standard equation of motion $m \ddot{x} = - V'(x)$ via the principle of stationary action.
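
For completeness, here is the intermediate step behind equation (1), written as a short sketch. It assumes only that $\delta x$ is compactly supported, so that the boundary term from integrating by parts vanishes:
$$\begin{split} S[x + \epsilon \delta x] - S[x] & = \epsilon \int dt\, \left( m \dot{x}\, \delta \dot{x} - V'(x)\, \delta x \right) + O(\epsilon^2) \\ & = \epsilon \int dt\, \left( - m \ddot{x} - V'(x) \right) \delta x + O(\epsilon^2). \end{split}$$
The coefficient of $\epsilon$ in the last line is the quantity that equation (1) calls $\delta S.$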

If $P[x](t)$ and $Q[x](t')$ are functionals of $x$ and its derivatives at times $t$ and $t',$ respectively, then time-ordered correlation functions are computed in the path integral by the formal expression
$$\langle P[x](t) Q[x](t') \rangle = \frac{\int \mathcal{D} x\, e^{i S[x]} P[x](t) Q[x](t')}{\int \mathcal{D} x\, e^{i S[x]}}.$$
Note that I haven't written the time-ordering symbol explicitly; $\langle X \rangle$ will always denote a time-ordered correlation function, not the vacuum expectation value $\langle \Omega | X | \Omega \rangle.$

Now, suppose we consider the relabeling $y(t) = x(t) + \epsilon \rho(t),$ where $\rho(t)$ is some fixed, compactly supported function that we add to every possible path $x(t).$ This is just a change of variables in the path integral; it seems manifest that it should have trivial Jacobian, so we should have
$$\int \mathcal{D} x\, e^{i S[x]} = \int \mathcal{D} x\, e^{i S[y]}.$$
In particular, this means that the correlation functions should satisfy
$$\langle P[x](t) Q[x](t') \rangle = \frac{\int \mathcal{D} x e^{i S[y]} P[y](t) Q[y](t')}{\int \mathcal{D} x\, e^{i S[x]}}. \tag{2}$$
In the limit as $\epsilon$ goes to zero, we can expand $P$ as
$$P[y](t) = P[x](t) + \epsilon \left( \frac{\partial P}{\partial x}[x](t) \rho(t) + \frac{\partial P}{\partial \dot{x}}[x](t) \dot{\rho}(t) + \dots \right) + O(\epsilon^2),$$
and similarly for $Q$.

Let's package the whole parenthetical term and call it $\delta_{\rho} P[x](t).$ We can also write the corresponding variation in the action as $S[y] = S[x] + \epsilon \delta_{\rho} S[x].$ Equation (2) can then be written as
$$\begin{split} \langle P[x](t) Q[x](t') \rangle & = \langle P[x](t) Q[x](t') \rangle + \epsilon \langle \delta_{\rho}P[x](t) Q[x](t') \rangle + \epsilon \langle P[x](t) \delta_{\rho} Q[x](t') \rangle \\ & + i \epsilon \langle \delta_{\rho} S[x] P[x](t) Q[x](t') \rangle + O(\epsilon^2).\end{split}$$
Matching the order-$\epsilon$ terms on both sides of the equation gives us
$$\langle \delta_{\rho}P[x](t) Q[x](t') \rangle + \langle P[x](t) \delta_{\rho} Q[x](t') \rangle + i \langle \delta_{\rho} S[x] P[x](t) Q[x](t') \rangle = 0. \tag{3}$$
All that remains is to compute $\delta_{\rho} S[x]$, which is easily found via equation (1) to be
$$\delta_{\rho} S[x] = \int d\tilde{t}\, (- m \ddot{x} - V'(x)) \rho(\tilde{t}).$$
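
As a sanity check on the logic behind equation (3), here is a minimal numerical sketch of its zero-dimensional analog. It is a toy, not the path integral itself: it assumes a single ordinary integration variable, a Euclidean weight $e^{-S}$ in place of $e^{i S}$ (so that the $i\, \delta S$ term becomes $-\delta S$), and a hypothetical quartic "action" chosen only for illustration. Shifting the integration variable by a constant then gives $\langle P'(x) \rangle = \langle S'(x) P(x) \rangle,$ which the code checks on a grid.

```python
import numpy as np

def action(x):
    """Hypothetical toy action (assumption: quartic, so the weight is non-Gaussian)."""
    return 0.5 * x**2 + 0.25 * x**4

def action_prime(x):
    """Derivative S'(x) of the toy action."""
    return x + x**3

def insertion(x):
    """Test insertion P(x) = x^3 (an arbitrary illustrative choice)."""
    return x**3

def insertion_prime(x):
    """Derivative P'(x), the analog of the variation of P under a constant shift."""
    return 3.0 * x**2

def expectation(values, weights):
    """Normalized average on a uniform grid; the grid spacing cancels in the ratio."""
    return np.sum(values * weights) / np.sum(weights)

grid = np.linspace(-10.0, 10.0, 200001)
weights = np.exp(-action(grid))  # Euclidean weight exp(-S)

lhs = expectation(insertion_prime(grid), weights)                  # <P'(x)>
rhs = expectation(action_prime(grid) * insertion(grid), weights)   # <S'(x) P(x)>
print(lhs, rhs)
```

The two printed numbers should agree up to discretization error, which is the finite-dimensional statement that a shift of the integration variable has trivial Jacobian.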

Now, suppose we take $\rho(\tilde{t})$ to be a step function that is equal to one on the interval $[t - \alpha, t + \alpha]$ and zero everywhere else; in particular, for $\alpha$ small enough, it is zero at $t'.$ Then the integral expression for $\delta_{\rho} S$ simplifies to
$$\delta_{\rho} S[x] = - m (\dot{x}(t+\alpha) - \dot{x}(t-\alpha)) - \int_{t-\alpha}^{t+\alpha} d\tilde{t}\, V'(x).$$
If we plug this into equation (3), then the correlation function that includes $\delta_{\rho} S$ becomes
$$- i m \langle (\dot{x}(t + \alpha) - \dot{x}(t - \alpha)) P[x](t) Q[x](t') \rangle - \dots,$$
where I've suppressed the second term, which depends on $V.$ This first term, in the limit $\alpha \rightarrow 0$, is just the expectation value of the commutator:
$$- i m \langle [\dot{x}(t), P[x](t)] Q[x](t')\rangle.$$
The time ordering takes care of this for us: in the term with the plus sign, the limit $\alpha \rightarrow 0$ produces the operator $\dot{x}(t) P[x](t),$ while in the term with the minus sign, the limit produces the operator $P[x](t) \dot{x}(t).$

The second term, the one that depends on $V$, is a little more complicated. A priori, we don't have a good way of understanding it. Naively, it looks like the operator $\int_{t-\alpha}^{t+\alpha} d\tilde{t}\, V'(x)$ goes to zero in the limit $\alpha \rightarrow 0,$ but we aren't considering that operator on its own; we're considering its insertion in the time-ordered correlator
$$- i \langle \int_{t-\alpha}^{t+\alpha} d\tilde{t}\, V'(x(\tilde{t})) P[x](t) Q[x](t') \rangle.$$
However, if we make the particular choice $P[x](t) = x(t),$ then since $V'(x)$ depends only on $x(\tilde{t})$ we know that $V'(x(\tilde{t}))$ and $P[x](t)$ commute; in this case, the $\alpha \rightarrow 0$ limit of this correlator actually does vanish.

So, with $P[x](t) = x(t),$ and with $\rho(\tilde{t})$ the aforementioned step function in the limit $\alpha \rightarrow 0,$ our identity (3) becomes
$$\langle Q[x](t') \rangle - i m \langle [\dot{x}(t), x(t)] Q[x](t') \rangle = 0,$$
where we have used $\delta_{\rho} x(t) = 1$ and $\delta_{\rho} Q[x](t') = 0.$ Since $m \dot{x}(t)$ is just the momentum operator $p(t),$ a little rearranging turns this equation into
$$\langle [x(t), p(t)] Q[x](t') \rangle = i \langle Q[x](t') \rangle.$$
Since this identity holds in correlation functions where $Q[x](t')$ is completely arbitrary, it must hold in the sense of the operator equation
$$[x(t), p(t)] = i.$$
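
To see what this operator equation looks like concretely, here is a small numerical sketch. It does not reproduce the path integral argument; it only checks the end result in a matrix representation, assuming harmonic oscillator ladder operators in a truncated number basis with $\hbar = m = \omega = 1$ (all of these choices are illustrative assumptions, not part of the derivation above).

```python
import numpy as np

N = 8  # size of the truncated number basis (an arbitrary illustrative choice)

# Lowering operator in the number basis: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
adag = a.conj().T

# Position and momentum in units with hbar = m = omega = 1 (assumed).
x = (a + adag) / np.sqrt(2)
p = 1j * (adag - a) / np.sqrt(2)

comm = x @ p - p @ x

# Away from the truncation edge, the commutator is i times the identity.
print(np.allclose(comm[:N - 1, :N - 1], 1j * np.eye(N - 1)))

# The last diagonal entry compensates, because the commutator of two
# finite matrices always has zero trace.
print(comm[N - 1, N - 1], np.trace(comm))
```

The failure in the last diagonal entry is an artifact of the truncation: $[x, p] = i$ cannot hold exactly for finite matrices, since the trace of a commutator vanishes while the trace of $i$ times the identity does not.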

2. Canonical anticommutation relations

We now consider a quantum mechanical system with $N$ fermionic degrees of freedom. The path integral formulation of such a theory is a bit unusual; the fields (i.e. the "positions" of the particles) are valued in a Grassmann algebra rather than in the real or complex numbers. I won't discuss Grassmann algebras in any detail here; all we need to know is that the algebra is generated by $N$ symbols $\{\chi_1, \dots, \chi_N\}$ that anticommute:
$$\chi_j \chi_k = - \chi_k \chi_j.$$
One can define a notion of integration on functions of these symbols, from which it is possible to construct a path integral for any Grassmann-valued Lagrangian. We won't discuss the details at all, as we won't need them for this discussion.
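
For readers who like to experiment, the anticommutation rule above can be modeled in a few lines. The following is a minimal sketch, not a standard library: the class name `Grassmann`, the helper `_sort_with_sign`, and the dictionary representation are all hypothetical choices made for illustration. Each element is stored as a map from a sorted tuple of generator indices to a coefficient, and products pick up a sign for every swap needed to sort the indices.

```python
def _sort_with_sign(indices):
    """Sort generator indices; return (sign, sorted tuple), or (0, ()) on a repeat."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()  # a repeated generator makes the monomial vanish
    sign = 1
    # Bubble sort: every swap of two adjacent generators flips the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


class Grassmann:
    """Element of a Grassmann algebra, stored as {sorted index tuple: coefficient}."""

    def __init__(self, terms=None):
        # Drop vanishing coefficients so that equality tests are meaningful.
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @classmethod
    def generator(cls, j):
        """The j-th anticommuting generator chi_j."""
        return cls({(j,): 1})

    def __add__(self, other):
        out = dict(self.terms)
        for k, v in other.terms.items():
            out[k] = out.get(k, 0) + v
        return Grassmann(out)

    def __neg__(self):
        return Grassmann({k: -v for k, v in self.terms.items()})

    def __mul__(self, other):
        out = {}
        for k1, v1 in self.terms.items():
            for k2, v2 in other.terms.items():
                sign, key = _sort_with_sign(k1 + k2)
                if sign != 0:
                    out[key] = out.get(key, 0) + sign * v1 * v2
        return Grassmann(out)

    def __eq__(self, other):
        return self.terms == other.terms


chi1, chi2 = Grassmann.generator(1), Grassmann.generator(2)
print(chi1 * chi2 == -(chi2 * chi1))  # the generators anticommute
print(chi1 * chi1 == Grassmann())     # so each generator squares to zero
```

Nothing in the rest of this note depends on this representation; it is only meant to make the sign rule tangible.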

The standard Lagrangian for a system of $N$ "Grassmann variables" is
$$\mathcal{L}[\psi](t) = \frac{i}{2} \sum_{j=1}^{N} \psi_j(t) \dot{\psi}_j(t) - V(\psi), \tag{4}$$
where $V(\psi)$ is some function of all the fields $\psi_1(t), \dots, \psi_N(t).$ Note that the fundamental generators of the Grassmann algebra, the $\chi_j$ symbols we discussed earlier, are different from the fields $\psi_j(t)$ that appear in this expression; in general, we will have something like
$$\psi_j(t) = \sum_{k} c_{jk}(t) \chi_{k},$$
where $\{c_{jk}(t)\}$ is a set of real-valued functions.

Now, let $b(t)$ be a compactly supported, Grassmann-valued function; i.e., let it be of the form
$$b(t) = \sum_{j} \beta_j(t) \chi_j$$
with each $\beta_j$ compactly supported. Let's consider $N$ such functions, $\{b_j(t)\}.$ We can then shift the fields by the infinitesimal perturbation
$$\psi_j(t) \mapsto \psi_j(t) + \epsilon b_j(t).$$
The discussion leading to equation (3) in the previous section then applies with only minor notational changes. For any functionals $P[\psi](t)$ and $Q[\psi](t')$ of the fields, we obtain the identity
$$\langle \delta_{b} P[\psi](t) Q[\psi](t') \rangle + \langle P[\psi](t) \delta_{b} Q[\psi](t') \rangle + i \langle \delta_{b} S[\psi] P[\psi](t) Q[\psi](t')\rangle = 0. \tag{5}$$
The variation of the action is easy to compute from the Lagrangian (4); it is given by
$$\delta_{b} S[\psi] = \frac{i}{2} \sum_{j=1}^{N} \int dt\, (b_j(t) \dot{\psi}_j(t) + \psi_j(t) \dot{b}_j(t)) - \sum_{j=1}^{N} \int dt\, b_j(t) \frac{\partial V}{\partial \psi_j}. \tag{6}$$
The second term requires a little bit of care; I haven't told you how to differentiate a function of Grassmann variables. But rest assured that there is a way to define differentiation — the so-called "left Grassmann derivative" — for which equation (6) holds.

Because $b_j(t)$ is compactly supported, we can apply integration by parts to the second term in parentheses to obtain
$$\delta_{b} S[\psi] = \frac{i}{2} \sum_{j=1}^{N} \int dt\, (b_j(t) \dot{\psi}_j(t) - \dot{\psi}_j(t) b_j(t)) - \sum_{j=1}^{N} \int dt\, b_j(t) \frac{\partial V}{\partial \psi_j}.$$
We then use the fact that $b_j(t)$ and $\dot{\psi}_j(t)$ anticommute to obtain
$$\delta_{b} S[\psi] = i \sum_{j=1}^{N} \int dt\, b_j(t) \dot{\psi}_j(t) - \sum_{j=1}^{N} \int dt\, b_j(t) \frac{\partial V}{\partial \psi_j}.$$
Now, let $b_j(t)$ be constant on the interval $[t-\alpha, t+\alpha],$ and zero elsewhere. Unlike in the bosonic case, we can't make it equal to one on this interval (since one is not an element of the Grassmann algebra); but we'll just make it equal to some arbitrary linear combination of the generators of the Grassmann algebra, which we'll denote by $b_j.$
 
Under this choice of $b_j(t),$ the variation in the action becomes
$$\delta_{b} S[\psi] = i \sum_{j=1}^{N} b_j (\psi_j(t+\alpha) - \psi_j(t-\alpha)) - \sum_{j=1}^{N} b_j \int_{t-\alpha}^{t+\alpha} d\tilde{t}\, \frac{\partial V}{\partial \psi_j}.$$
For small enough $\alpha$, we have $\delta_{b} Q[\psi](t') = 0.$ If we choose $P[\psi](t) = \psi_k(t),$ then we have $\delta_{b} P[\psi](t) = b_k(t) = b_k,$ and in the small-$\alpha$ limit, dropping the $V$-dependent term as we did in the bosonic case, equation (5) becomes
$$\langle b_k Q[\psi](t')\rangle - \sum_{j=1}^{N} b_j \lim_{\alpha \rightarrow 0} \langle (\psi_j(t+\alpha) - \psi_j(t-\alpha)) \psi_{k}(t) Q[\psi](t') \rangle = 0.$$
In the bosonic case of the previous section, we said that in a time-ordered correlator, the expression
$$(A(t+\alpha) - A(t-\alpha)) B(t)$$
becomes, in the $\alpha \rightarrow 0$ limit,
$$[A(t), B(t)].$$
For fermionic fields, this isn't quite true. Because the "classical" fields $\psi_j$ and $\psi_k$ anticommute inside the path integral, we have
$$\psi_j(t + \alpha) \psi_k(t) - \psi_j(t - \alpha) \psi_k(t) = \psi_j(t + \alpha) \psi_k(t) + \psi_k(t) \psi_j(t-\alpha).$$
So in a time-ordered correlator, this actually gets replaced in the $\alpha \rightarrow 0$ limit by the anticommutator
$$\{\psi_j(t), \psi_k(t)\}.$$
Plugging this back in gives us the expression
$$\langle b_k Q[\psi](t')\rangle - \sum_{j=1}^{N} b_j \langle \{\psi_j(t), \psi_k(t)\} Q[\psi](t') \rangle = 0.$$
For this to hold for arbitrary functionals $Q[\psi](t')$, we must have the operator expression
$$b_k = \sum_{j=1}^{N} b_j \{\psi_j(t), \psi_k(t)\}.$$ 
But since the constants $\{b_j\}$ were completely arbitrary, we could easily pick them to be linearly independent, in which case we obtain
$$\{\psi_j(t), \psi_k(t)\} = \delta_{j k}.$$
This is the canonical anticommutation relation for fermions.
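
As with the bosonic case, it can be reassuring to see explicit matrices that satisfy this relation. The sketch below is an illustration under stated assumptions rather than part of the derivation: it builds $N = 4$ operators from Pauli matrices using a Jordan-Wigner-style string (the function name `majorana_operators` and the choice of two qubits are hypothetical), normalizes them by $1/\sqrt{2}$, and checks all pairwise anticommutators numerically.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def majorana_operators(n_qubits):
    """Return 2 * n_qubits matrices psi_j built from strings of Pauli matrices."""
    ops = []
    for m in range(n_qubits):
        for pauli in (X, Y):
            # Z on the first m sites, X or Y on site m, identity afterwards.
            factors = [Z] * m + [pauli] + [I2] * (n_qubits - m - 1)
            # The 1/sqrt(2) normalization makes each psi_j square to 1/2.
            ops.append(reduce(np.kron, factors) / np.sqrt(2))
    return ops

psi = majorana_operators(2)  # N = 4 fermionic degrees of freedom
dim = psi[0].shape[0]

# Check {psi_j, psi_k} = delta_jk times the identity for every pair.
ok = all(
    np.allclose(psi[j] @ psi[k] + psi[k] @ psi[j], (j == k) * np.eye(dim))
    for j in range(len(psi))
    for k in range(len(psi))
)
print(ok)
```

Without the factor of $1/\sqrt{2}$, the same matrices would satisfy the relation with $2 \delta_{jk}$ on the right-hand side, which is the other common normalization.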

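Note: Sometimes, especially in texts that start from an operator description of the theory rather than a Lagrangian description, you'll see the anticommutation relation $\{\psi_j(t), \psi_k(t)\} = 2 \delta_{jk}.$ This can be obtained from the path integral by rescaling the fields as $\psi_j \mapsto \psi_j / \sqrt{2}$ in the Lagrangian, which makes the kinetic term $\frac{i}{4} \sum_j \psi_j \dot{\psi}_j$ instead of $\frac{i}{2} \sum_j \psi_j \dot{\psi}_j.$ More generally, repeating the derivation above with a kinetic term $i c \sum_j \psi_j \dot{\psi}_j$ gives $\{\psi_j(t), \psi_k(t)\} = \delta_{jk} / (2c),$ so a larger coefficient in the kinetic term gives a smaller anticommutator.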
