Ward Identities

Ward identities are one of the most fundamental tools for studying quantum field theory, and they're encountered in almost any quantum field theory course. You've almost certainly encountered them before, so why should I bother writing about them? Simply put: despite learning how to derive Ward identities for the first time more than 5 years ago (in my first quantum field theory class, as an undergraduate at UChicago), I didn't really understand why they were important until quite recently. This is a product of my own unique research path — I haven't ever done any research in pure QFT, working instead mostly in quantum information and classical geometry, which means I haven't ever had to really understand what's going on under the hood in field theory. I don't think this oversight is so uncommon, so I'm putting together some basic thoughts on Ward identities in this post.

So, what is a Ward identity? On its face, it's an equation that tells you how classical conservation laws break down under quantization. We have some classically conserved current, $\partial_{\mu} j^{\mu} = 0,$ and the (non-anomalous) Ward identity associated with that current is
$$\langle \partial_{\mu} j^{\mu}(x) X_1(x_1) \dots X_n(x_n) \rangle = \sum_{k=1}^{n} \delta(x - x_k) \delta_{k} \langle X_1(x_1) \dots X_n(x_n)\rangle.$$
The symbol $\delta_k$ appearing on the right-hand side of this equality is supposed to mean "within the following correlation function, replace the operator $X_k(x_k)$ with its variation under the infinitesimal symmetry associated with the current $j^{\mu}.$"

At first glance, the only information in the Ward identity above seems to be a violation of current conservation by "contact terms." The delta functions on the right-hand side mean that as an operator, $\partial_{\mu} j^{\mu}(x)$ vanishes except in correlation functions that contain other operators at the point $x.$ My first reaction to this was: so what? It seems intuitive that classically conserved currents should be quantumly conserved, and the fact that they fail to be conserved in correlators with coincident operators seems like a minor technical detail. But a Ward identity actually contains a great deal more information than that; in particular, the real power of a Ward identity comes out only once you integrate the point $x$ over a region $R.$ When you do this, you get the new identity
$$\langle \int_{\partial R} j^{\mu}(x) X_1(x_1) \dots X_n(x_n) \rangle = \sum_{x_k \in R} \langle X_1(x_1) \dots \delta X_k(x_k) \dots X_n(x_n)\rangle.$$
So what we really learn from the Ward identity is a complete specification of the "surface operator" $Q_{\partial R} = \int_{\partial R} j^{\mu}(x),$ for any region $R.$ Knowing the correlation functions of an operator $Q_{\partial R}$ against an arbitrary set of operators $X_1(x_1) \dots X_n(x_n)$ is equivalent to knowing how the operator acts on any state in Hilbert space. This is the power of the Ward identity: it tells you anything you might like to know about a large class of operators $Q_{\partial R}$ associated with the symmetry whose current is $j^{\mu}.$

The rest of this post will deal with these issues in more detail. In section 1, I will give a very general treatment of Noether's theorem in Lagrangian field theory using differential forms. (I will consider only tensor field theories without internal gauge groups; more on that later.) Section 1 is fairly mathy, because it is written in generality for field theories on arbitrary curved manifolds; you can skip it if you're happy with picking up a bit of notation for the presentation of Noether's theorem in the sequel. In section 2, I will give the standard derivation of Ward identities from the path integral for arbitrary quantum field theories; I will include potential contributions from anomalies, and comment on their importance. In section 3, I will explain how, in flat Minkowski or Euclidean space, non-anomalous Ward identities can be used to derive the commutators of associated charge operators with arbitrary fields.

For additional reading, I recommend section 4.2.2 of David Tong's notes on conformal field theory, and chapter 2.4 of the big yellow book on conformal field theory. I found this paper of Lee and Wald useful for thinking abstractly about the Lagrangian formulation of field theories. While preparing this post, I found Kartik Prabhu's paper on the first law of black hole mechanics for gauge fields very helpful for thinking about the proper mathematical formulation of gauge theory Lagrangians, but ultimately decided to remove all discussion of gauge theories from this post because it felt a bit too far afield of the main points. I also exchanged some emails with Kartik while writing this post that were helpful to my understanding of gauge fields and spinor fields.

Prerequisites: Path integrals in quantum field theory. Section 1 requires background in differential forms, and it would probably be helpful to know general relativity. Section 3 requires understanding what it means to prepare a state with the path integral in quantum field theory; I like Tom Hartman's notes for this.

1. Noether's Theorem in Lagrangian Field Theory

(The first three paragraphs of this section talk a bit about fiber bundles, but understanding the details isn't necessary for understanding the rest of the post. If you aren't comfortable with fiber bundles, just let them wash over you and move on.)

For the purposes of this post, a field on a manifold $\mathcal{M}$ will be a vector bundle with $\mathcal{M}$ as its base space; a field configuration is a section of that bundle. For example, a scalar field on $\mathcal{M}$ is formalized via the trivial bundle $\mathcal{M} \times \mathbb{R},$ with a field configuration being a smooth map $\phi : \mathcal{M} \rightarrow \mathbb{R}.$ Field theory is most often taught as though all fields have trivial bundle structure, i.e., it is assumed that field configurations are maps from $\mathcal{M}$ into some target space $T.$ There are plenty of important fields with nontrivial bundle structure, however — for example, tangent vector fields on smooth manifolds live in the tangent bundle, which does not have a product structure if $\mathcal{M}$ is topologically nontrivial.

This definition is broad enough to encompass quite a large class of physically important fields. In particular, it includes arbitrary tensor fields. Unfortunately, it is not broad enough to include gauge connections or fields with gauge symmetry (such as spinors), which are defined as bundles over a principal bundle $P \rightarrow \mathcal{M},$ and whose sections cannot be realized as sections of a bundle over $\mathcal{M}.$ Gauge theories are more complicated, and will not be considered in this post.

Put more simply: in this post, all fundamental fields are tensor fields. The direct sum of vector bundles is itself a vector bundle, so the configurations of all fields on a manifold can be written compactly as $\phi^{A}(x),$ where the index $A$ is valued in the fiber of the "sum bundle" over $x.$ In more pedestrian terms, the index $A$ ranges over all components of all tensor fields in the theory.

Now, suppose that $\mathcal{M}$ is an $n$-dimensional, orientable manifold with a collection of fields $\phi^{A}(x).$ A Lagrangian is an $n$-form $\mathbf{\mathcal{L}}$ that is a local function of the fields and finitely many derivatives of those fields. Let's break that statement down. First, a Lagrangian should be an $n$-form because it should be possible to integrate it over $\mathcal{M}$ to obtain an action. Because the space of $n$-forms at each point of an $n$-dimensional manifold is one-dimensional, however, if we choose any volume form $\boldsymbol{\epsilon}$ on $\mathcal{M}$ then we may write
$$\mathbf{\mathcal{L}} = L \boldsymbol{\epsilon},$$
where $L$ is a smooth, real function on $\mathcal{M}.$ (This is why Lagrangians are usually written as functions — choosing a metric on $\mathcal{M}$ picks out a preferred volume form $\boldsymbol{\epsilon},$ and thus a preferred way of writing the differential form $\mathbf{\mathcal{L}}$ as a function $L$.)

I've also said that $\mathbf{\mathcal{L}}$ should depend on only finitely many derivatives of the fields. There's a problem with this statement, though: I haven't yet told you how to take derivatives of the fields on $\mathcal{M}$! It turns out that it doesn't really matter. Let ${}^{(1)} \nabla_{a}$ be a covariant derivative operator on our bundle of fields, i.e., a linear operator from field configurations to field-valued one-forms satisfying the product rule
$${}^{(1)} \nabla_{a} (f \phi^{A}) = (df)_{a} \phi^{A} + f \cdot {}^{(1)} \nabla_{a} \phi^{A}$$
for any smooth function $f : \mathcal{M} \rightarrow \mathbb{R}.$ Let ${}^{(2)} \nabla_{a}$ be a covariant derivative operator on the bundle of field-valued one-forms, i.e., a linear map from field-valued one-forms to field-valued two-forms satisfying a similar product rule. (This is necessary because while we specified how ${}^{(1)} \nabla_{a}$ acts on fields $\phi^{A},$ we didn't specify how it acts on fields like $\omega_{a}{}^{A}.$) This lets us take second derivatives as ${}^{(2)} \nabla_{a} {}^{(1)} \nabla_{b} \phi^{A}.$ Specify $k$ such derivatives, up to ${}^{(k)}\nabla_{a}$, and now assume that the Lagrangian takes the form
$$\mathbf{\mathcal{L}}(x) = \mathbf{\mathcal{L}}[\phi^{A}(x), {}^{(1)} \nabla_{a_1} \phi^{A}(x), \dots, {}^{(k)} \nabla_{a_k} \dots {}^{(1)} \nabla_{a_1} \phi^{A}(x)].$$
We then say that $\mathbf{\mathcal{L}}$ depends on finitely many of the fields $\phi^{A}(x)$ and the derivatives ${}^{(1)} \nabla_{a}, \dots, {}^{(k)} \nabla_{a}.$ But if we choose a different set of derivative operators, ${}^{(1)} \tilde{\nabla}_{a}, \dots, {}^{(k)} \tilde{\nabla}_{a},$ then these are related to the original set of derivative operators by generalizations of Christoffel symbols. More specifically, if we fix a local basis $(e_j)^{A}$ for the fields, and write $\phi^{A}(x)$ as $\phi^{A} = \sum_j f_j(x) (e_j)^{A}$, then we have
$${}^{(1)} \nabla_{a} \phi^{A}(x) - {}^{(1)} \tilde{\nabla}_{a} \phi^{A}(x) = \sum_j f_j(x) \left[ {}^{(1)} \nabla_{a} (e_j)^{A} - {}^{(1)} \tilde{\nabla}_{a} (e_j)^{A}\right].$$
The left-hand side is basis-independent, so the right-hand side must be as well. The right-hand side depends on the value of $\phi^{A}$ only at the point $x,$ so the left-hand side must depend on $\phi^{A}$ only at that point as well. So ${}^{(1)} \nabla_a - {}^{(1)} \tilde{\nabla}_{a}$ is a local linear map from fields to field-valued one-forms, i.e., it is a tensor:
$$({}^{(1)} \nabla_a - {}^{(1)} \tilde{\nabla}_{a}) \phi^{A} = {}^{(1)} \Gamma^{A}{}_{a C} \phi^{C}.$$

The point of all this is that the difference between two derivative operators is expressible in terms of a local field. So if a Lagrangian depends on finitely many $\nabla$-type derivatives, it will also depend on only finitely many $\tilde{\nabla}$-type derivatives provided that we include the conversion tensor among our local fields! Let me restate this, for emphasis: the statement that a Lagrangian depends on finitely many derivatives of the fields is independent of our notion of derivative, so long as we expand our collection of fields to include the tensors used to translate between different covariant derivatives. For the usual Levi-Civita derivative used in general relativity, the one satisfying $\nabla_{a} g_{bc} = 0$, this is like the statement that a Lagrangian that depends on finitely many Levi-Civita derivatives of a field also depends on only finitely many coordinate derivatives in any coordinate system, provided that we include the metric as one of the local fields of the theory.
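
For a concrete instance (my own illustration, using the standard component conventions of general relativity rather than anything specific to this post), take the field to be a vector field $v^{b}$, let $\nabla_{a}$ be the Levi-Civita derivative, and let $\partial_{a}$ be the coordinate derivative of some chart. Then the conversion tensor is the familiar Christoffel symbol:
$$(\nabla_{a} - \partial_{a}) v^{b} = \Gamma^{b}{}_{a c} v^{c}, \qquad \Gamma^{b}{}_{a c} = \frac{1}{2} g^{b d} \left( \partial_{a} g_{d c} + \partial_{c} g_{d a} - \partial_{d} g_{a c} \right).$$
The right-hand side involves no derivatives of $v^{b}$, and $\Gamma^{b}{}_{ac}$ is built from the metric and its first derivatives, which is why including the metric among the local fields is enough to trade one notion of derivative for the other.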

OK; that's enough formalism. Let's get to Noether's theorem. For any point in $\mathcal{M}$, and for any system of coordinates in a neighborhood of that point, there is a collection of local fields for which the Lagrangian takes the form
$$\mathbf{\mathcal{L}}(x) = L[\phi^{A}(x), \partial_{\mu_1} \phi^{A}(x), \dots, \partial_{\mu_1} \dots \partial_{\mu_k} \phi^{A}(x)] \boldsymbol{\epsilon}(x),$$

where the volume form $\boldsymbol{\epsilon}$ that appears in this expression is the flat volume form in our chosen coordinate system, and therefore has no dependence on the field configuration. We can always do this, once we've picked a particular coordinate system, by adding local fields that convert between covariant and coordinate derivatives as in the preceding paragraph. (One advantage of using coordinate derivatives is that they commute, which makes the following presentation more notationally simple.) We will also fix a local basis for the field configurations, which will allow us to replace the abstract index $A$ with a concrete coordinate index $\alpha$:
$$\mathbf{\mathcal{L}}(x) = L[\phi^{\alpha}(x), \partial_{\mu_1} \phi^{\alpha}(x), \dots, \partial_{\mu_1} \dots \partial_{\mu_k} \phi^{\alpha}(x)] \boldsymbol{\epsilon}(x).$$

Now, suppose we consider a smooth family of field configurations $\phi^{\alpha}(x, \lambda),$ i.e., a smooth map from $\mathbb{R}$ into the space of field configurations. This is the natural setting for considering "infinitesimal perturbations" of field configurations, which are realized by the one-parameter family $\phi^{\alpha}(x, \lambda) = \phi^{\alpha}(x) + \lambda \delta \phi^{\alpha}(x)$ in the limit $\lambda \rightarrow 0.$ The Lagrangian then becomes a function of $\lambda$, and its derivative can be written using the chain rule as
$$\frac{\partial}{\partial \lambda} \mathbf{\mathcal{L}}(x, \lambda) = \sum_{j=0}^{k} \left( \frac{\partial L}{\partial (\partial_{\mu_{1}} \dots \partial_{\mu_{j}} \phi^{\alpha})}(x, \lambda) \right) \left(\partial_{\mu_1} \dots \partial_{\mu_j} \frac{\partial \phi^{\alpha}}{\partial \lambda}(x, \lambda) \right)\boldsymbol{\epsilon}(x).$$
By successively applying the product rule to the derivatives acting on $\partial \phi^{\alpha}/\partial \lambda,$ we can rewrite this in terms of a term where no derivative acts on $\partial \phi^{\alpha} / \partial \lambda,$ and a term that is a total derivative; this is just like the usual story from classical mechanics, but generalized to arbitrary tensor field theories. We will label these pieces by $E_{\alpha}(x, \lambda)$ and $\partial_{\mu} \theta^{\mu}(x, \lambda)$; we then have
$$\frac{\partial}{\partial \lambda} \mathbf{\mathcal{L}}(x, \lambda) = \left[ E_{\alpha}(x, \lambda) \frac{\partial \phi^{\alpha}}{\partial \lambda}(x, \lambda) + \partial_{\mu} \theta^{\mu}(x, \lambda) \right] \boldsymbol{\epsilon}(x).$$
It is a basic result in the theory of differential forms that $(\partial_{\mu} \theta^{\mu}) \boldsymbol{\epsilon},$ which is an $n$-form, is equivalent to $d(\theta \cdot \boldsymbol{\epsilon}),$ i.e., the exterior derivative of the $(n-1)$-form obtained by contracting $\theta^{\mu}$ into the first index of $\boldsymbol{\epsilon}.$ This follows by noting that $d(\theta \cdot \boldsymbol{\epsilon})$ and $\boldsymbol{\epsilon}$ must be proportional, since they are both $n$-forms on an $n$-dimensional manifold, and directly computing the proportionality function as $\partial_{\mu} \theta^{\mu}.$
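
As a quick check of this statement in the simplest case (an illustrative sketch assuming $n=2$, coordinates $(x^1, x^2)$, the flat volume form $\boldsymbol{\epsilon} = dx^{1} \wedge dx^{2}$, and the convention $(\theta \cdot \boldsymbol{\epsilon})_{b} = \theta^{a} \epsilon_{a b}$), the contraction and its exterior derivative are
$$\theta \cdot \boldsymbol{\epsilon} = \theta^{1} dx^{2} - \theta^{2} dx^{1}, \qquad d(\theta \cdot \boldsymbol{\epsilon}) = \partial_{1} \theta^{1} \, dx^{1} \wedge dx^{2} - \partial_{2} \theta^{2} \, dx^{2} \wedge dx^{1} = (\partial_{\mu} \theta^{\mu}) \boldsymbol{\epsilon}.$$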

Given these considerations, we will group $E_{\alpha}$ and $\boldsymbol{\epsilon}$ as a single $n$-form $\mathbf{E}_{\alpha},$ and we will group $\partial_{\mu} \theta^{\mu}$ and $\boldsymbol{\epsilon}$ as an exact $n$-form $d \boldsymbol{\theta}.$ The resulting expression is
$$\frac{\partial}{\partial \lambda} \mathbf{\mathcal{L}}(x, \lambda) = \mathbf{E}_{\alpha}(x, \lambda) \frac{\partial \phi^{\alpha}}{\partial \lambda}(x, \lambda) + d \boldsymbol{\theta}(x, \lambda).$$
We will now see that $\mathbf{E}_{\alpha}$ and $d \boldsymbol{\theta}$ are independent of all of the many coordinate choices we made in this section. First, if we integrate the above equation over $\mathcal{M}$, then by Stokes' theorem the $\boldsymbol{\theta}$-dependent term disappears, assuming appropriate dropoff conditions on the field near any asymptotic boundary of $\mathcal{M}.$ The left-hand side of the expression is independent of coordinate choices, and $\partial \phi^{\alpha}/ \partial \lambda$ is arbitrary, so $\mathbf{E}_{\alpha}$ must be independent of coordinate choices as well. Then, since $d \boldsymbol{\theta}$ can be expressed as a difference of two coordinate-independent objects, it is also coordinate-independent. The differential form $\mathbf{E}_{\alpha}$ is called the equation of motion of the theory. When it vanishes for a particular field configuration (i.e., at some fixed $\lambda$), the action is stationary: the integral of the Lagrangian over all of $\mathcal{M}$ is unchanged at leading order as $\lambda$ is varied.
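
To see these objects in a familiar setting (a sketch under assumptions of my choosing: a single real scalar field in flat spacetime with mostly-plus signature and Lagrangian function $L = -\tfrac{1}{2} \partial_{\mu} \phi \partial^{\mu} \phi - \tfrac{1}{2} m^{2} \phi^{2}$), one application of the product rule gives
$$\frac{\partial L}{\partial \lambda} = -\partial^{\mu} \phi \, \partial_{\mu} \frac{\partial \phi}{\partial \lambda} - m^{2} \phi \frac{\partial \phi}{\partial \lambda} = \left( \partial_{\mu} \partial^{\mu} \phi - m^{2} \phi \right) \frac{\partial \phi}{\partial \lambda} + \partial_{\mu} \left( - \partial^{\mu} \phi \frac{\partial \phi}{\partial \lambda} \right).$$
So in this example $E = \partial_{\mu} \partial^{\mu} \phi - m^{2} \phi,$ whose vanishing is the Klein-Gordon equation, and $\theta^{\mu} = - \partial^{\mu} \phi \, \partial \phi / \partial \lambda,$ up to the usual freedom in choosing $\boldsymbol{\theta}.$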

Now, the field $\boldsymbol{\theta}$ is not independent of coordinate choice; only $d \boldsymbol{\theta}$ is. So there is a whole equivalence class of allowable $\boldsymbol{\theta}$ terms that differ from one another by closed forms, i.e., forms $\boldsymbol{\omega}$ with $d \boldsymbol{\omega} = 0.$ It was shown by Wald in this paper that this ambiguity can be restricted if one demands that $\boldsymbol{\theta}$ be expressed locally in terms of the fields and their derivatives. The only closed forms that can be constructed from local fields are exact, i.e. they satisfy $\boldsymbol{\omega} = d \boldsymbol{\alpha}$ for some $\boldsymbol{\alpha},$ and so the ambiguity in $\boldsymbol{\theta}$ is only up to addition by exact forms.

Finally, we introduce the idea of an infinitesimal symmetry. An infinitesimal symmetry is a flow on the space of field configurations, i.e., a map from each $\phi^{A}(x)$ to a one-parameter family  $\phi^{A}(x) + \lambda \delta \phi^{A}(x),$ with the property that the Lagrangian changes at leading order in $\lambda$ by an exact form that is a local function of the fields. That is, for this particular one-parameter family, we should have
$$\left. \frac{\partial \mathbf{\mathcal{L}}}{\partial \lambda} \right|_{\lambda = 0} = d \mathbf{W}[\phi^{A}, \partial_{\mu_1} \phi^{A}, \dots].$$
Just like $\boldsymbol{\theta},$ the form $\mathbf{W}$ is ambiguous up to the addition of an exact form.

If we compare this expression for the symmetric variation of the Lagrangian to the general formula for an arbitrary variation, we obtain the identity
$$\mathbf{E}_{\alpha} \delta \phi^{\alpha} + d\boldsymbol{\theta} = d\mathbf{W}.$$
We stress that while $\boldsymbol{\theta}$ is defined for arbitrary variations, we must evaluate it on the symmetric variation corresponding to $\mathbf{W}$ in order for this identity to hold. What we learn from this expression is that, for the particular symmetric variation $\delta \phi^{\alpha},$ the form $\mathbf{E}_{\alpha} \delta \phi^{\alpha}$ must be an exact form $d(\mathbf{W} - \boldsymbol{\theta}).$ We denote this exact form by $d \mathbf{J},$ and call $\mathbf{J}$ the Noether current corresponding to the symmetry. It is defined on any field configuration. Because it is defined to satisfy the equality
$$d\mathbf{J} = \mathbf{E}_{\alpha} \delta \phi^{\alpha},$$
the form $\mathbf{J}$ is closed on any field configuration where the equations of motion are satisfied.
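
A minimal example, under assumptions that are mine rather than anything fixed by the general discussion: take a massless real scalar in flat spacetime with mostly-plus signature, $L = -\tfrac{1}{2} \partial_{\mu} \phi \partial^{\mu} \phi,$ and the shift symmetry $\delta \phi = 1.$ The Lagrangian does not change at all under a constant shift, so we may take $\mathbf{W} = 0.$ Writing forms in terms of their dual vectors, so that $d \boldsymbol{\theta} = (\partial_{\mu} \theta^{\mu}) \boldsymbol{\epsilon}$ and $d \mathbf{J} = (\partial_{\mu} j^{\mu}) \boldsymbol{\epsilon},$ we have
$$E = \partial_{\mu} \partial^{\mu} \phi, \qquad \theta^{\mu} = - \partial^{\mu} \phi \, \delta \phi = - \partial^{\mu} \phi, \qquad j^{\mu} = W^{\mu} - \theta^{\mu} = \partial^{\mu} \phi,$$
and indeed $\partial_{\mu} j^{\mu} = \partial_{\mu} \partial^{\mu} \phi = E \, \delta \phi,$ which vanishes exactly when the equation of motion holds.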

There is a trick that will be very useful in the following section, which is to use $\mathbf{J}$ to study how the action of a field configuration changes under a "smeared symmetry." To construct a smeared symmetry, we start with an infinitesimal symmetry $\phi^{A}(x, \lambda) = \phi^{A}(x) + \lambda \delta \phi^{A}(x)$, fix a smooth function $\rho(x)$ on spacetime, and construct from these two pieces of data the one-parameter family
$$\phi^{A}(x, \lambda) = \phi^{A}(x) + \lambda \rho(x) \delta \phi^{A}(x).$$
Our general identity for Lagrangian variations tells us that the linear-in-$\lambda$ variation in the Lagrangian within this family of field configurations is
$$\delta \mathbf{\mathcal{L}} = \mathbf{E}_{\alpha}(x) \rho(x) \delta \phi^{\alpha}(x) + d \boldsymbol{\theta}.$$
But we already know that $\mathbf{E}_{\alpha}(x) \delta \phi^{\alpha}(x)$ is the differential of the Noether current, $\mathbf{J}(x).$ So if we integrate this expression to obtain the variation in the action, the $d \boldsymbol{\theta}$ term drops out and we obtain
$$\delta S = \int_{\mathcal{M}} \rho(x) d \mathbf{J}.$$
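
As a sanity check in an example (assumed for illustration: a massless real scalar in flat spacetime with mostly-plus signature, $L = -\tfrac{1}{2} \partial_{\mu} \phi \partial^{\mu} \phi,$ shift symmetry $\delta \phi = 1,$ Noether current $j^{\mu} = \partial^{\mu} \phi,$ and a smearing function $\rho$ of compact support), the smeared variation $\phi \mapsto \phi + \lambda \rho$ changes the action at first order by
$$\delta S = - \int d^{n} x \, \partial_{\mu} \phi \, \partial^{\mu} \rho = \int d^{n} x \, \rho \, \partial_{\mu} \partial^{\mu} \phi = \int d^{n} x \, \rho \, \partial_{\mu} j^{\mu},$$
where the middle step is an integration by parts. This matches $\int_{\mathcal{M}} \rho \, d \mathbf{J}$ once $d \mathbf{J}$ is written as $(\partial_{\mu} j^{\mu}) \boldsymbol{\epsilon}.$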

As a final comment, let me put all of this back in the form you're probably familiar with from flat space QFT. If we fix a metric on our manifold, then we obtain a preferred volume form $\boldsymbol{\epsilon}$, which is an $n$-form. Since $d \mathbf{J}$ is also an $n$-form, the two must be related by
$$d \mathbf{J} = F \boldsymbol{\epsilon}.$$
If we contract the volume form into both sides of the equation, and use the fact that $\boldsymbol{\epsilon} \cdot \boldsymbol{\epsilon}$ is normalized to be $(-1)^{s} n!,$ with $s$ the number of minus signs in the metric signature, then we obtain
$$F = \frac{(-1)^s}{n!} \boldsymbol{\epsilon} \cdot d \mathbf{J}.$$
If you actually write out the exterior derivative in terms of covariant derivatives, and do the index manipulations, you find:
$$\boldsymbol{\epsilon} \cdot d \mathbf{J} = n \nabla_a ( \epsilon^{a a_2 \dots a_n} J_{a_2 \dots a_n}).$$
(If you don't know how to do these manipulations, I recommend Appendix B of Wald's textbook on general relativity.)

So if we define the vector $j^a$ by
$$j^a = \frac{(-1)^s}{(n-1)!} \epsilon^{a a_2 \dots a_n} J_{a_2 \dots a_n},$$

then we have
$$d \mathbf{J} = (\nabla_a j^a) \boldsymbol{\epsilon}.$$
This is a completely equivalent description; we can talk about Noether currents as being vector fields instead of $(n-1)$-forms, so long as we have a preferred volume form around to convert between the two. This is why in Minkowski spacetime, where the volume form is trivial in inertial coordinates, forms are never discussed at all.
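
For a low-dimensional check of these index manipulations (an illustrative case I am assuming: the Euclidean plane, so $n = 2$ and $s = 0,$ with $\epsilon_{12} = 1$ and Wald's convention $(d \mathbf{J})_{ab} = \partial_{a} J_{b} - \partial_{b} J_{a}$ for a one-form), the definition of $j^{a}$ gives
$$j^{1} = \epsilon^{1 2} J_{2} = J_{2}, \qquad j^{2} = \epsilon^{2 1} J_{1} = - J_{1}, \qquad d \mathbf{J} = (\partial_{1} J_{2} - \partial_{2} J_{1}) \boldsymbol{\epsilon} = (\partial_{1} j^{1} + \partial_{2} j^{2}) \boldsymbol{\epsilon}.$$
In words, the vector current is the one-form current rotated by ninety degrees, and the curl of the one-form becomes the divergence of the vector.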

2. Ward Identities

Now, suppose we want to quantize a field theory on a manifold $\mathcal{M}.$ We have a set of fields $\phi,$ which might include background fields like the metric, and a Lagrangian $n$-form that can be written as
$$\mathbf{\mathcal{L}}[\phi(x), \nabla_{a_1} \phi(x), \dots, \nabla_{a_k} \dots \nabla_{a_1} \phi(x)].$$
(Since we've fixed a metric, I'm now writing everything in terms of the Levi-Civita derivative on tensor fields and using abstract indices; this changes notation from the previous section, where I considered completely general notions of covariant derivative, but only ever did calculations in local patches. I'm also removing all indices from fields for the sake of notational simplicity.)

We will now split our collection of fields into two sets, reserving the name $\phi$ for the fields we want to quantize, and $\gamma$ for background fields like the metric. These background fields will not be included in the path integral measure. For notational convenience, I will also start writing the Lagrangian as $\mathbf{\mathcal{L}}[\phi, \gamma],$ with dependence on derivatives being implicit. According to the path integral formulation of quantum field theory, correlation functions of fields are defined by the formal expression
$$\langle \phi(x_1) \dots \phi(x_n) \rangle = \frac{1}{Z} \int \mathcal{D}\phi \phi(x_1) \dots \phi(x_n) e^{\alpha \int_{\mathcal{M}} \mathbf{\mathcal{L}}[\phi, \gamma]}.$$
The coefficient $\alpha$ appearing in the exponent is $i$ for a Lorentzian field theory, and $(-1)$ for a Euclidean field theory. $Z$ is the standard normalization, corresponding to the path integral with no operator insertions.

We now consider a field transformed under a smeared symmetry $\phi_*(x) = \phi(x) + \lambda \rho(x) \delta \phi(x).$ (Recall from the previous section that the term "smeared symmetry" indicates that if $\rho(x)$ were constant, this would be an actual infinitesimal symmetry of the theory.) The path integral giving correlation functions of $\phi$ can be rewritten under the change of variables $\phi \mapsto \phi_*$ as
$$\langle \phi(x_1) \dots \phi(x_n) \rangle = \frac{1}{Z} \int \mathcal{D}\phi_* \phi_*(x_1) \dots \phi_*(x_n) e^{\alpha \int_{\mathcal{M}} \mathbf{\mathcal{L}}[\phi_*, \gamma]}.$$
As a purely formal expression, the path integral measure $\mathcal{D}\phi_*$ can be written to first order in $\lambda$ as $(1 + \lambda A_{\rho}[\phi]) \mathcal{D} \phi.$ The term $A_{\rho}[\phi],$ which is a functional of the field and may depend on $\rho(x),$ is an anomaly. A symmetry for which it vanishes, for all $\rho(x)$, is called a non-anomalous symmetry. And we know, from the previous section, that there is a Noether current $\mathbf{J}$ associated to the symmetry for which the first-order-in-$\lambda$ change in the action is $\int_{\mathcal{M}} \rho(x) d \mathbf{J}(x).$

The correlation functions of $\phi$ are $\lambda$-independent, so the path integral over $\phi_*$ must be $\lambda$-independent as well. If we collect all order-$\lambda$ terms, and set them equal to zero, we obtain the expression
$$\begin{split}0 & = \frac{1}{Z} \sum_{k=1}^{n} \int \mathcal{D}\phi \phi(x_1) \dots \rho(x_k) \delta \phi(x_k) \dots \phi(x_n) e^{\alpha \int_{\mathcal{M}} \mathbf{\mathcal{L}}[\phi, \gamma]} \\ & + \frac{1}{Z} \int \mathcal{D}\phi \phi(x_1) \dots \phi(x_n) e^{\alpha \int_{\mathcal{M}} \mathbf{\mathcal{L}}[\phi, \gamma]} \int_{\mathcal{M}} \alpha \rho(x) d\mathbf{J}(x) \\ & + \frac{1}{Z} \int \mathcal{D}\phi A_{\rho}[\phi] \phi(x_1) \dots \phi(x_n) e^{\alpha \int_{\mathcal{M}} \mathbf{\mathcal{L}}[\phi, \gamma]}.\end{split}$$
We can rewrite the first term as
$$\int_{\mathcal{M}} \rho(x) \sum_{k=1}^{n} \delta(x - x_k) \langle \phi(x_1) \dots \delta \phi(x_k) \dots \phi(x_n) \rangle,$$
the second term as
$$\alpha \int_{\mathcal{M}} \rho(x) \langle d\mathbf{J}(x) \phi(x_1) \dots \phi(x_n) \rangle,$$
and the third term as
$$\langle A_{\rho}[\phi] \phi(x_1) \dots \phi(x_n) \rangle.$$
If the symmetry is non-anomalous, meaning we set $A_{\rho} = 0$, then we obtain
$$\int_{\mathcal{M}} \rho(x) \sum_{k=1}^{n} \delta(x - x_k) \langle \phi(x_1) \dots \delta \phi(x_k) \dots \phi(x_n) \rangle = - \alpha \int_{\mathcal{M}} \rho(x) \langle d\mathbf{J}(x) \phi(x_1) \dots \phi(x_n) \rangle.$$
If this is to hold for arbitrary smearings $\rho(x)$, then we must have the local identity
$$\sum_{k=1}^{n} \delta(x - x_k) \langle \phi(x_1) \dots \delta \phi(x_k) \dots \phi(x_n) \rangle = - \alpha \langle d\mathbf{J}(x) \phi(x_1) \dots \phi(x_n) \rangle.$$
This is the Ward identity. For $\alpha=-1,$ which is Euclidean signature, it is identical to the Ward identity quoted in the introduction to this post.
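
It is worth seeing what this says in the simplest example I can think of. The conventions here are my own assumptions for illustration: a free massless scalar in flat Euclidean space, weighted by $e^{-\int L \boldsymbol{\epsilon}}$ with $L = \tfrac{1}{2} \partial_{\mu} \phi \partial_{\mu} \phi,$ and the shift symmetry $\delta \phi = 1.$ Repeating the steps of section 1 gives $E = - \partial_{\mu} \partial_{\mu} \phi$ and, with $\mathbf{W} = 0,$ a Noether current whose dual vector is $j_{\mu} = - \partial_{\mu} \phi.$ The Ward identity with a single insertion $\phi(y)$ then reads
$$\langle \partial_{\mu} j_{\mu}(x) \phi(y) \rangle = - \partial_{x}^{2} \langle \phi(x) \phi(y) \rangle = \delta(x - y) \langle \delta \phi(y) \rangle = \delta(x - y),$$
which is the statement that the two-point function is a Green's function for $- \partial^{2}.$ (I am ignoring infrared subtleties of the massless scalar, which matter in low dimensions.)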

One brief comment, before we move on to the final section on commutators: as mentioned in the introduction, the Ward identity implies that $d \mathbf{J}(x)$ vanishes as an operator in correlation functions where no other operators are inserted at $x.$ I said in the introduction that I found this fact unsurprising when I first learned quantum field theory; after all, it seems like a classically conserved current ought to be quantumly conserved. But in reality, that statement is highly nontrivial! The classical current is conserved only on field configurations satisfying the equation of motion. But when we quantize using the path integral, we integrate over all field configurations, regardless of whether or not they satisfy the classical equations of motion. The fact that the current is quantumly conserved is actually quite remarkable.

3. Commutators of Charges

In the previous section, we discussed correlation functions of quantum field theories on a background $\mathcal{M},$ defined via the path integral. At the moment, these are purely abstract functions of the fields. In order to interpret them as expectation values of operators in Hilbert space, it is necessary to introduce a foliation of $\mathcal{M}$ by surfaces $\Sigma_{t},$ with $t$ parametrizing the foliation. This foliation gives us an associated Hilbert space, $\mathcal{H},$ which we usually think of as living "on each leaf of the foliation." It also gives us an operator $H(t)$ on Hilbert space, the Hamiltonian, that generates a "time evolution" on the foliation that tells us how Hilbert space states should evolve when we move between leaves. (Remark: Not all manifolds admit such foliations! But e.g. Minkowski spacetime and Euclidean spacetime do. More generally, in Lorentz signature, this is possible in an arbitrary globally hyperbolic manifold; that's the setting for most canonical approaches to quantizing QFT in curved spacetimes.)

Now fix a particular leaf of the foliation, $\Sigma,$ which for convenience we will label by $t=0.$ (Usually when quantizing a theory in flat spacetime this will be a spacelike hyperplane, but not always; for example, in radial quantization of conformal field theories, $\mathbb{R}^n$ is foliated by spheres centered at the origin, and $t=0$ is usually taken to be the sphere of unit radius.) By doing the path integral only over the set
$$\cup_{t < 0} \Sigma_t,$$
and leaving the boundary conditions at $\Sigma$ open, we generate a state of the quantum field theory on $\mathcal{H}.$ If we insert operators in the region $t < 0$, this changes the state prepared by the path integral. It is a standard assumption in non-gauge theories that the set of all states generated by local operator insertions has a dense span in Hilbert space, i.e., that we can approximate any state arbitrarily well by a sum of states prepared by a path integral.

Now comes the fun part. Suppose we have an infinitesimal symmetry of the quantum fields with associated Noether current $\mathbf{J},$ and Ward identity
$$- \alpha \langle d \mathbf{J}(x) \phi(x_1) \dots \phi(x_n) \rangle = \sum_{k=1}^{n} \delta(x-x_k) \delta_k \langle \phi(x_1) \dots \phi(x_n) \rangle,$$
where $\alpha$ is $i$ in Minkowski signature and $(-1)$ in Euclidean signature. Suppose I then integrate this expression in $x$ over a small neighborhood of $\Sigma$; for concreteness, let's integrate over the region bounded by $\Sigma_{\epsilon}$ and $\Sigma_{-\epsilon}.$ Using Stokes' theorem, and introducing the notation $Q(\epsilon) = \int_{\Sigma_{\epsilon}} \mathbf{J},$ I obtain the expression
$$- \alpha \langle Q(\epsilon) \phi(x_1) \dots \phi(x_n) \rangle + \alpha \langle Q(-\epsilon) \phi(x_1) \dots \phi(x_n) \rangle = \sum_{x_k \in [-\epsilon, \epsilon]} \delta_k \langle \phi(x_1) \dots \phi(x_n) \rangle.$$
So far, these expressions are in terms of abstract correlation functions, not in terms of Hilbert space quantities. In Hilbert space terms, the correlation functions are expressed as "time-ordered" vacuum expectation values. The correlator $\langle O_{1}(x_1) \dots O_n(x_n) \rangle$ is expressed up to an overall normalization factor as
$$\langle 0 | O_{j_1}(x_{j_1}) U(t_{j_1}, t_{j_2}) O_{j_2}(x_{j_2}) \dots U(t_{j_{n-1}}, t_{j_n}) O_{j_n}(x_{j_n}) |0 \rangle,$$
where $|0\rangle$ is the vacuum state, where the operators have been reordered in order of decreasing time, and where $U(t, t')$ is the time evolution operator from $t'$ to $t$.

In this language, the correlator $\langle Q(\epsilon) \phi(x_1) \dots \phi(x_n) \rangle$ becomes, in the limit $\epsilon \rightarrow 0,$
$$\langle \text{state prepared by } t>0| Q(0) \prod_{x_j \in \Sigma} \phi(x_j) | \text{state prepared by } t<0\rangle,$$
and $\langle Q(-\epsilon) \phi(x_1) \dots \phi(x_n) \rangle$ becomes
$$\langle \text{state prepared by } t>0| \prod_{x_j \in \Sigma} \phi(x_j) Q(0) | \text{state prepared by } t<0\rangle.$$
So our integrated Ward identity, quite beautifully, becomes
$$- \alpha \langle t>0| [Q(0), \prod_{x_j \in \Sigma} \phi(x_j)] | t<0 \rangle = \sum_{x_k \in \Sigma} \delta_k \langle t>0 | \prod_{x_j \in \Sigma} \phi(x_j) | t<0 \rangle.$$
Assuming the states prepared by the path integral are dense in the Hilbert space, this equality of matrix entries implies an equality of operators:
$$- \alpha [Q(0), \prod_{x_j \in \Sigma} \phi(x_j)]= \sum_{x_k \in \Sigma} \delta_k \prod_{x_j \in \Sigma} \phi(x_j).$$
In particular, by inserting only a single operator on $\Sigma$, we can obtain the identity
$$- \alpha [Q(0), \phi(x)]= \delta \phi(x).$$
This is a wonderful identity. It tells us that $Q(0)$, which is a Hilbert space operator constructed by integrating the Noether current over $\Sigma$, is the operator that generates local transformations of fields inserted on $\Sigma,$ via the commutator.
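
As a consistency check against canonical quantization (a sketch under assumptions of mine: a massless real scalar in Minkowski space with mostly-plus signature, $L = -\tfrac{1}{2} \partial_{\mu} \phi \partial^{\mu} \phi,$ the shift symmetry $\delta \phi = 1$ with current $j^{\mu} = \partial^{\mu} \phi,$ the leaf $\Sigma$ at $t = 0,$ and the charge written in inertial coordinates as $Q = \int_{\Sigma} d^{n-1} y \, j^{0}$), the conjugate momentum is $\pi = \partial_{t} \phi = - j^{0},$ so the equal-time commutator $[\phi(x), \pi(y)] = i \delta(x - y)$ gives
$$[Q, \phi(x)] = - \int_{\Sigma} d^{n-1} y \, [\pi(y), \phi(x)] = i, \qquad - \alpha [Q, \phi(x)] = - i \cdot i = 1 = \delta \phi(x),$$
in agreement with the general identity for $\alpha = i.$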

As a final comment, let me make contact with another common way that people write commutators of fields with conserved charges. Equation 2.163 of the yellow pages, for example, lists the Euclidean charge-commutator identity as
$$[Q, \phi(x)] = - i G \phi(x).$$
But because they have defined $G$ in equation 2.153 by
$$- i G \phi(x) = \delta \phi(x),$$
this matches my expression with $\alpha = -1,$ which is the right convention for Euclidean signature.
