Measure theory and $L^p$ spaces

I'm on a quest to learn about operator algebras in the hopes of understanding the many interesting ways they have been applied to quantum field theory. This note is the first in a series that will build the essential aspects of the theory from the ground up. I will not prove everything, giving references for the proofs of many lemmas, but I will try to give enough detail that the mathematical underpinnings of the theory are clear.

This post is about $L^p$ spaces. These are Banach spaces of functions (with $L^\infty$ actually being a Banach algebra) that show up frequently in the theory of general operator algebras. I will introduce the basics of measure theory and the general theory of Lebesgue integration, introduce the $L^p$ spaces, and prove some basic theorems concerning them.

The outline is:

  1. In section 1, I will introduce the basic tools of measure theory: σ-algebras, measurable functions, and measures.
  2. In section 2, I will discuss some elementary properties of real- and complex-valued measurable functions.
  3. In section 3, I will define the Lebesgue integral on a general measure space and state some important theorems such as the dominated convergence theorem.
  4. In section 4, I will make some comments on the role of sets of measure zero, and how to think about them.
  5. In section 5, I will introduce the $L^p$ spaces and prove that they are Banach spaces. I will also show that $L^2$ is a Hilbert space.
I learned most of this material from Walter Rudin's textbook "Real and Complex Analysis," with a few points being picked up later by reading Ronald Douglas's textbook "Banach Algebra Techniques in Operator Theory."

Prerequisites: Basics of topology and real analysis.

Table of Contents

  1. σ-algebras, measurability and measures
  2. Real and complex measurable functions
  3. Integration
  4. Sets of measure zero
  5. $L^p$ spaces

1. σ-algebras, measurability, and measures

Given a set X, a measure is a way of assigning volume to certain subsets of X. It can be confusing, on first approaching measure theory, to understand why only some subsets of X should be considered measurable. But when we think about how crazy sets can be — completely arbitrary collections of points in X — it shouldn't be so surprising that a consistent theory of measure might only allow us to measure a restricted family of subsets.

So let $\Sigma$ be some collection of subsets of $X$ that we will declare to be "measurable." It will turn out that not every collection $\Sigma$ can reasonably be called measurable; we will discover by investigation what properties $\Sigma$ ought to have. We begin by considering a function $\mu:\Sigma\to[0,\infty]$, which we will call a measure, and which we will think of as assigning volumes to elements of $\Sigma$.

There are two obvious properties that $\mu$ should have: (i) it should assign measure zero to the empty set, $\mu(\emptyset)=0$, and (ii) if $E_1,E_2\in\Sigma$ are disjoint, then we should have $\mu(E_1\cup E_2)=\mu(E_1)+\mu(E_2)$. In fact, since we allow the measure of a set to be infinite, we might as well extend property (ii) to sequences of disjoint sets: if $\{E_n\}$ is a sequence in $\Sigma$ of pairwise-disjoint sets, then we should have $\mu(\bigcup_j E_j)=\sum_j\mu(E_j)$.

In order for the preceding paragraph to make sense, we need $\emptyset$ to be contained in $\Sigma$, and we need $\Sigma$ to be closed under countable unions. Another condition that makes sense to impose is that $\mu$ should be able to assign a volume to the full set $X$, whether that volume be finite or infinite; so we should require $X\in\Sigma$. Finally, if we are able to measure the sum of the volumes of disjoint sets, $\mu(E_1)+\mu(E_2)=\mu(E_1\cup E_2)$, then we really ought to be able to measure the difference in volume of nested sets. That is, if $E_1\subseteq E_2$ and $E_1,E_2\in\Sigma$, then we ought to have $\mu(E_2\setminus E_1)=\mu(E_2)-\mu(E_1)$, at least when $\mu(E_1)$ is finite so that the difference makes sense. So we should impose $E_2\setminus E_1\in\Sigma$.

To meet all the criteria of the preceding paragraph, it is sufficient to require that $\Sigma$ (i) contains the empty set, (ii) is closed under complements, and (iii) is closed under countable unions. Any collection of subsets of $X$ satisfying these three properties is called a σ-algebra on $X$. Once a σ-algebra has been specified, $X$ is said to be a measurable space and the sets in the σ-algebra are said to be measurable sets. Standard manipulations in set theory show that conditions (i-iii) imply that $\Sigma$ contains $X=\emptyset^c$, is closed under countable intersections $\bigcap_j E_j=(\bigcup_j E_j^c)^c$, and is closed under set subtractions: $A,B\in\Sigma$ implies $A\setminus B=A\cap B^c\in\Sigma$.

One can easily show that the intersection of σ-algebras is a σ-algebra, which lets us define the σ-algebra "generated by" any collection of subsets of X as the intersection of all σ-algebras containing that collection. When X is a topological space, it is often convenient to require that all open sets be measurable; the Borel algebra on X is the σ-algebra generated by open subsets of X.
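On a finite set, the "generated by" construction can be carried out by brute force, which may help make it concrete. The following is only a sketch under that finiteness assumption (where closure under finite unions is the same as closure under countable unions); the function name `generate_sigma_algebra` and the example sets are hypothetical choices of mine, not anything from Rudin.

```python
from itertools import combinations

def generate_sigma_algebra(X, generators):
    """Smallest family of subsets of a FINITE set X that contains the
    generators and is closed under complements and unions.
    (Hypothetical helper; only meaningful for finite X.)"""
    X = frozenset(X)
    sigma = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        # Close under complements.
        for A in current:
            if X - A not in sigma:
                sigma.add(X - A)
                changed = True
        # Close under pairwise unions; repeated passes give all finite unions.
        for A, B in combinations(current, 2):
            if A | B not in sigma:
                sigma.add(A | B)
                changed = True
    return sigma

X = {1, 2, 3, 4}
sigma = generate_sigma_algebra(X, [{1}, {1, 2}])
# The atoms are {1}, {2}, {3, 4}; the points 3 and 4 cannot be separated.
```

Here the generated σ-algebra consists of all unions of the three atoms, so a set like {3} is not measurable even though {3, 4} is.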

A map from a measurable space X into a topological space Y is said to be a measurable function if preimages of open sets in Y are measurable in X; this mimics the definition of a continuous function on a topological space, for which preimages of open sets are open. Note that if X is a topological space, then all continuous maps from X to Y are measurable with respect to the Borel algebra.

We are now ready to define a measure properly. Given a measurable space $X$ with σ-algebra $\Sigma$, a measure is a map $\mu:\Sigma\to[0,\infty]$ satisfying $\mu(\emptyset)=0$ and $\mu(\bigcup_j E_j)=\sum_j\mu(E_j)$ for pairwise-disjoint measurable sets $E_j$. From this definition, many interesting properties can be proved; the most important of these, for the moment, is that measures are monotonic: for $A\subseteq B$ both measurable, we have $\mu(B)=\mu(A)+\mu(B\setminus A)\ge\mu(A)$.

A measurable space on which a measure has been defined is called a measure space.
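To see the axioms in action, here is a minimal sketch of a measure on a four-point set where every subset is measurable and each point carries a weight. The weights and the names `weights` and `mu` are illustrative assumptions; exact rational arithmetic is used so the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical point weights on X = {a, b, c, d}; note that c has weight zero.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(0), "d": Fraction(2)}

def mu(E):
    """Measure of a subset E of X: the sum of the weights of its points."""
    return sum((weights[x] for x in E), Fraction(0))

A = {"a", "b"}
B = {"a", "b", "d"}
C = {"c", "d"}

additive = mu(A | C) == mu(A) + mu(C)      # A and C are disjoint
monotone = mu(A) <= mu(B)                  # A is a subset of B
difference = mu(B - A) == mu(B) - mu(A)    # fine because mu(A) is finite
```

The set {c} is nonempty but has measure zero, which is the kind of set that section 4 is about.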

2. Real and complex measurable functions

Let $X$ be a measure space with σ-algebra $\Sigma$ and measure $\mu$. We will consider maps from $X$ into $\mathbb{R}$ or $\mathbb{C}$. I won't prove the basic properties of these maps, but I will state them below. (The proofs aren't very instructive, but if you're curious they can be found in chapter 1 of Rudin's book.) They are:
  • If $u:X\to\mathbb{R}$ and $v:X\to\mathbb{R}$ are measurable, then $f(x)=u(x)+iv(x)$ is measurable.
  • If $f:X\to\mathbb{C}$ is measurable, then its real part, its imaginary part, and its magnitude are all measurable.
  • If $f$ and $g$ are measurable functions from $X$ to $\mathbb{R}$ or $\mathbb{C}$, then $fg$ and $f+g$ are measurable.
  • The pointwise supremum and infimum (and hence limit-superior and limit-inferior) of any sequence of measurable functions are measurable.
This last point implies that for any measurable function $f:X\to\mathbb{R}$, the positive and negative parts $f^+=\max\{f,0\}$ and $f^-=-\min\{f,0\}$ are measurable. Both parts are nonnegative, and $f=f^+-f^-$. The decomposition of $f$ into positive and negative parts will play an important part in defining integration in the next section.

For any set $E$ in $X$, we define the characteristic function $\chi_E$ to be the function from $X$ to $\mathbb{R}$ that takes the value 1 on $E$ and 0 on $E^c$. Clearly $E$ is a measurable set if and only if $\chi_E$ is a measurable function.
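The last statement can be checked by hand on a small example. The sketch below assumes a finite set, where a real-valued function is measurable exactly when each of its level sets is measurable; the σ-algebra is built from three atoms that I chose for illustration, and `is_measurable` and `chi` are hypothetical helper names.

```python
from itertools import combinations

X = {1, 2, 3, 4}
# Hypothetical sigma-algebra on X: all unions of the atoms {1}, {2}, {3, 4}.
atoms = [frozenset({1}), frozenset({2}), frozenset({3, 4})]
sigma = {
    frozenset().union(*combo)
    for r in range(len(atoms) + 1)
    for combo in combinations(atoms, r)
}

def is_measurable(f):
    """On a finite set, f (a dict point -> value) is measurable iff the
    preimage of every value it takes lies in sigma."""
    return all(
        frozenset(x for x in f if f[x] == v) in sigma
        for v in set(f.values())
    )

def chi(E):
    """Characteristic function of E, as a dict on X."""
    return {x: (1 if x in E else 0) for x in X}

measurable_case = is_measurable(chi({1, 2}))   # {1, 2} is in sigma
non_measurable_case = is_measurable(chi({3}))  # {3} is not in sigma
```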

3. Integration

Given a measure space $X$, a simple function is a measurable function whose range has only finitely many values. Any such function can be written as a finite sum of characteristic functions over disjoint measurable sets $E_j$:
$$s(x)=\sum_{j=1}^{n}\alpha_j\chi_{E_j}(x).$$
The utility of simple functions is that they can be used to approximate any measurable function $f$ "from below" in the sense that we can find a sequence of simple functions with $|s_1|\le|s_2|\le\dots\le|f|$ for which $s_n$ converges pointwise to $f$. We will prove this now for $f:X\to[0,\infty]$, but the same conclusion holds for general $f:X\to\mathbb{C}$ by approximating the real-positive, real-negative, imaginary-positive, and imaginary-negative parts of $f$ separately.

For each $n$, we will divide the positive real axis $[0,\infty]$ into a finite number of layers; some collection of "small" layers that cover all of the real axis up to $n$, and then one big layer that covers the remaining portion of the real axis from $n$ all the way to infinity. We want the "small" layers to decrease in width with increasing $n$. To accomplish this, we will divide the interval $[0,n]$ into $n2^n$ segments each of which has length $2^{-n}$. For any $x\in X$, we then approximate $f(x)$ by the bottom of the layer it lies in, defining the function $s_n$ by
$$s_n(x)=\max_{k=0,\dots,n2^n}\left\{k2^{-n}\,\middle|\,k2^{-n}\le f(x)\right\}.$$
The function $s_n$ takes on finitely many values, each of which can be expressed in terms of sets of the form $f^{-1}([\alpha,\infty])$. These sets are measurable, so we conclude that $s_n$ is a simple function. It is straightforward to check the properties $s_1\le s_2\le\dots\le f$ and $s_n\to f$ pointwise.
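The layer construction is easy to evaluate pointwise, which makes the two claimed properties easy to test numerically. The following is a sketch of the formula for $s_n$ at a single point; the function name `s_n` and the sample value 0.7 are my own choices.

```python
import math

def s_n(f_value, n):
    """Value of the n-th approximant at a point where f equals f_value,
    with f_value in [0, inf]: the largest multiple of 2**-n that is at most
    min(f_value, n)."""
    if f_value >= n:
        # The single "big" layer [n, inf] is sent to n.
        return float(n)
    # Otherwise round f_value down to the nearest multiple of 2**-n.
    return math.floor(f_value * 2**n) / 2**n

approximations = [s_n(0.7, n) for n in range(1, 6)]
# -> [0.5, 0.5, 0.625, 0.6875, 0.6875]
```

The values never decrease with $n$ and never exceed 0.7, and once $n$ is larger than $f(x)$ the error is below $2^{-n}$, which is the pointwise convergence.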

It is easy to define what we mean by the integral of a simple function with respect to the measure $\mu$: it ought to just be
$$\int_X d\mu\,\Big(\sum_j\alpha_j\chi_{E_j}\Big)=\sum_j\alpha_j\mu(E_j).$$
Because simple functions can be used to approximate any measurable function from below, it makes sense to define the integral of a general measurable function $f:X\to[0,\infty]$ by approximation:
$$\int_X d\mu\,f=\sup\left\{\int_X d\mu\,s\,\middle|\,s\le f,\ s\text{ simple}\right\}.$$
We define integration over a measurable subset $A$ by
$$\int_A d\mu\,f=\int_X d\mu\,\chi_A f.$$
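On a finite measure space every function is simple, so the first formula above is already the whole integral. Here is a small sketch under that assumption, with hypothetical point weights and helper names (`mu`, `integrate_simple`, `integrate_over`) chosen for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical point weights on X = {a, b, c, d}.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(0), "d": Fraction(2)}

def mu(E):
    return sum((weights[x] for x in E), Fraction(0))

def integrate_simple(s):
    """Integral of a simple function s (dict point -> value): group points
    into level sets E_alpha and return the sum of alpha * mu(E_alpha)."""
    level_sets = defaultdict(set)
    for x, alpha in s.items():
        level_sets[alpha].add(x)
    return sum((alpha * mu(E) for alpha, E in level_sets.items()), Fraction(0))

def integrate_over(A, s):
    """Integral over a subset A, defined by multiplying s by chi_A."""
    restricted = {x: (alpha if x in A else Fraction(0)) for x, alpha in s.items()}
    return integrate_simple(restricted)

s = {"a": Fraction(3), "b": Fraction(3), "c": Fraction(7), "d": Fraction(1)}
total = integrate_simple(s)             # 3 * (1/2 + 1/3) + 7 * 0 + 1 * 2
partial = integrate_over({"a", "c"}, s) # 3 * 1/2 + 7 * 0
```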

It is straightforward to check from this definition that the integral of a measurable function has nice, standard properties:
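  1. $0\le f\le g$ implies $\int_X d\mu\,f\le\int_X d\mu\,g$.
  2. $A\subseteq B$ and $f\ge0$ implies $\int_A d\mu\,f\le\int_B d\mu\,f$.
  3. $c\in[0,\infty)$ and $f\ge0$ implies $\int_X d\mu\,cf=c\int_X d\mu\,f$.
  4. $\mu(A)=0$ or $f|_A=0$ each imply $\int_A d\mu\,f=0$.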
A few important properties of integration are slightly less trivial to show; again, proofs can be found in chapter 1 of Rudin's book. They are:
  1. Integration is linear: $f,g\ge0$ implies $\int_X d\mu\,(f+g)=\int_X d\mu\,f+\int_X d\mu\,g$.
  2. If $f\ge0$ is a measurable function on $X$, then the map $E\mapsto\int_E d\mu\,f$ is a measure on $X$.
So far we have only defined integrals of positive measurable functions. It is easier to do this than in the general case, for the same reason that sums of positive numbers are always insensitive to rearrangement, while general sums can converge to different values if they are rearranged. A general sum can be rearranged only if it converges absolutely; this inspires us to give a notion of what it means for a function to be absolutely integrable.

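If $f:X\to\mathbb{C}$ is measurable, then we say $f$ is integrable if $\int_X d\mu\,|f|$ is finite. The space of all such functions is denoted $\mathcal{L}^1(\mu)$. The real and imaginary parts of an integrable function are also integrable, as are the positive and negative parts of those, where for real-valued $u$ we write $u^+=\max\{u,0\}$ and $u^-=\max\{-u,0\}$, so that both are nonnegative and $u=u^+-u^-$. This lets us define the integral of a general function in $\mathcal{L}^1(\mu)$:
$$\int_X d\mu\,f=\int_X d\mu\,\mathrm{Re}(f)^+-\int_X d\mu\,\mathrm{Re}(f)^-+i\int_X d\mu\,\mathrm{Im}(f)^+-i\int_X d\mu\,\mathrm{Im}(f)^-.$$
This definition of complex integration is easily checked to be complex-linear using real-linearity of the positive-real case.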

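Complex integration also satisfies a version of the triangle inequality. If $f$ is in $\mathcal{L}^1(\mu)$, then we can write its integral in polar form as $\int_X d\mu\,f=re^{i\theta}$ with $r\ge0$. We then have
$$\left|\int_X d\mu\,f\right|=\left|\int_X d\mu\,fe^{-i\theta}\right|=\int_X d\mu\,\mathrm{Re}(fe^{-i\theta})\le\int_X d\mu\,|f|.$$
In the second step we have used that the integral of $fe^{-i\theta}$ equals $r$, which is real and nonnegative, so we can drop the absolute value and replace the integrand by its real part. In the third step, we have used that the real part of a number is upper bounded by its modulus.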
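The rotation trick can be traced numerically on a finite measure space, where the integral is a weighted sum. This is only a sketch with made-up weights and function values; floating-point arithmetic means the equalities hold up to rounding.

```python
import cmath

# Hypothetical three-point measure space: point weights and values of f.
weights = [0.5, 1.5, 2.0]
f = [1 + 2j, -3 + 0.5j, 0.25 - 1j]

integral = sum(w * z for w, z in zip(weights, f))
r, theta = cmath.polar(integral)  # integral = r * exp(i * theta), r >= 0

# Rotate f by exp(-i * theta) so that its integral becomes real and nonnegative.
rotated = [z * cmath.exp(-1j * theta) for z in f]
rotated_integral = sum(w * z for w, z in zip(weights, rotated))

lhs = abs(integral)
middle = sum(w * z.real for w, z in zip(weights, rotated))  # integral of Re(f e^{-i theta})
rhs = sum(w * abs(z) for w, z in zip(weights, f))           # integral of |f|
```

Up to rounding, `lhs` equals `middle`, and both are bounded by `rhs`.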

A final important tool in the study of integration over measures is the dominated convergence theorem. I won't prove it; the proof isn't hard, but it requires a few lemmas that I don't think are particularly instructive on their own. The statement is that if $f_n$ is a sequence of complex, measurable functions on $X$ that converges pointwise to a function $f$, and that satisfies $|f_n|\le g$ for some $g\in\mathcal{L}^1(\mu)$, then $f_n$ and $f$ are also in $\mathcal{L}^1(\mu)$ and the integrals converge:
$$\int_X d\mu\,f_n\to\int_X d\mu\,f.$$

4. Sets of measure zero

One of the really interesting things about measure spaces is that in integration, sets of measure zero just don't matter at all. If E is a set of measure zero, and f and g are measurable functions on X that agree away from E, then we have
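$$\int_X d\mu\,f=\int_E d\mu\,f+\int_{X\setminus E}d\mu\,f=0+\int_{X\setminus E}d\mu\,f=\int_E d\mu\,g+\int_{X\setminus E}d\mu\,g=\int_X d\mu\,g.$$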
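In fact, a converse is true as well: if $f$ and $g$ are integrable and $\int_A d\mu\,f=\int_A d\mu\,g$ for every measurable set $A$, then $f$ and $g$ must agree "almost everywhere," i.e., there must be a set of measure zero $E$ away from which they agree. (Equality of the integrals over $X$ alone is not enough, since positive and negative contributions to $\int_X d\mu\,(f-g)$ can cancel.) As usual, it suffices to show this when $f-g$ is nonnegative, and to get the general complex case by restricting to the measurable sets on which the real and imaginary parts of $f-g$ have a definite sign. In the case $f-g\ge0$, we define the measurable sets
$$E_n=\{x\mid f(x)>g(x)+1/n\}.$$
We have
$$\mu(E_n)\le n\int_{E_n}d\mu\,(f-g)\le n\int_X d\mu\,(f-g)=0.$$
This implies that each $E_n$ has measure zero, so their union, which is the set $\{x\mid f(x)\neq g(x)\}$, also has measure zero.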

These observations inspire us to define equivalence classes on $\mathcal{L}^1(\mu)$, where we say two integrable functions $f$ and $g$ are equivalent if they differ by a function that vanishes almost everywhere. The space of equivalence classes is called $L^1(\mu)$; a class $[f]\in L^1(\mu)$ has a well-defined integral, and by abuse of notation elements of $L^1(\mu)$ are treated as functions, rather than equivalence classes, since for most purposes the behavior of a function on a set of measure zero is completely irrelevant.
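A tiny example may make the equivalence relation feel less abstract. In the sketch below, which assumes a three-point measure space with hypothetical weights, two functions differ wildly at a point of weight zero and are nevertheless indistinguishable by integration.

```python
from fractions import Fraction

# Hypothetical point weights; the point c is a set of measure zero.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(0)}

def integrate(f):
    """Integral of f (dict point -> value) against the weighted measure."""
    return sum((f[x] * weights[x] for x in weights), Fraction(0))

f = {"a": Fraction(2), "b": Fraction(-1), "c": Fraction(5)}
g = {"a": Fraction(2), "b": Fraction(-1), "c": Fraction(-100)}

differ_on = {x for x in weights if f[x] != g[x]}
differ_on_null_set = sum((weights[x] for x in differ_on), Fraction(0)) == 0

# The integral of |f - g| vanishes, so f and g define the same class.
distance = integrate({x: abs(f[x] - g[x]) for x in weights})
```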

5. $L^p$ spaces

A generalization of $L^1(\mu)$ now presents itself to us. For $f:X\to\mathbb{C}$ measurable and $1\le p<\infty$, we define the p-norm of $f$ by
$$\|f\|_p=\left(\int_X d\mu\,|f|^p\right)^{1/p}.$$
The term "p-norm" is slightly inaccurate, as $\|\cdot\|_p$ isn't actually a norm; it isn't positive definite, since it assigns zero to all functions that vanish almost everywhere. We define $\mathcal{L}^p(\mu)$ to be the set of measurable functions on which the p-norm is finite, and define $L^p(\mu)$ to be the quotient of $\mathcal{L}^p(\mu)$ by the space of functions that vanish almost everywhere. We will see that the p-norm is an actual norm on $L^p(\mu)$. To show this, it suffices to show that the p-norm is a seminorm on $\mathcal{L}^p(\mu)$, i.e., that it satisfies all the properties of a norm except for positive definiteness.

Before proceeding to show that the p-norm is a norm on $L^p(\mu)$, we will define a p-norm in the limit $p\to\infty$. We define the essential supremum of a measurable function $f:X\to[0,\infty]$ to be the smallest number $a$ such that the set $\{x\mid f(x)>a\}$ has measure zero. Formally, we write
$$\operatorname{ess\,sup}(f)=\inf\{a\in\mathbb{R}\mid\mu(f^{-1}((a,\infty]))=0\}.$$
The $\infty$-norm of a complex, measurable function is the essential supremum of its absolute value. We define $\mathcal{L}^\infty(\mu)$ to be the space of measurable functions with finite $\infty$-norm, and $L^\infty(\mu)$ to be the quotient by the set of functions that vanish almost everywhere.
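For a measure given by finitely many point weights, the essential supremum is just the maximum over points of positive weight, which makes the difference from the ordinary supremum visible. The sketch below assumes such a space; the weights, the function values, and the helper names `p_norm` and `ess_sup_norm` are illustrative. It also evaluates the p-norm at a large $p$ to suggest why the $\infty$-norm deserves to be called the $p\to\infty$ case, at least on a space of finite total measure.

```python
# Hypothetical point weights; the point c has measure zero.
weights = {"a": 0.5, "b": 2.0, "c": 0.0}
f = {"a": 3.0, "b": -1.0, "c": 10.0}

def p_norm(f, p):
    """p-norm for finite p: (sum over points of |f|^p * weight)^(1/p)."""
    return sum(abs(f[x]) ** p * w for x, w in weights.items()) ** (1.0 / p)

def ess_sup_norm(f):
    """Infinity-norm: the largest |f(x)| over points of positive weight."""
    return max((abs(f[x]) for x, w in weights.items() if w > 0), default=0.0)

ordinary_sup = max(abs(v) for v in f.values())  # sees the value at c
essential_sup = ess_sup_norm(f)                 # ignores the null point c
large_p = p_norm(f, 100)                        # already close to essential_sup
```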

Now, we will observe that all of the p-norms, including the $\infty$-norm, satisfy the triangle inequality $\|f+g\|_p\le\|f\|_p+\|g\|_p$ and the absolute homogeneity condition $\|\alpha f\|_p=|\alpha|\|f\|_p$. Let's start with the absolute homogeneity condition; for $p<\infty$, this follows trivially from the definition due to linearity of integration. For $p=\infty$ and $\alpha=0$, the conclusion is obvious. For $p=\infty$ and $\alpha\neq0$, we have
$$\{x\mid|\alpha f(x)|>a\}=\{x\mid|f(x)|>a/|\alpha|\},$$
and thus
$$\|\alpha f\|_\infty=\inf\{a\mid\mu(\{x\mid|f(x)|>a/|\alpha|\})=0\}=|\alpha|\inf\{a\mid\mu(\{x\mid|f(x)|>a\})=0\}=|\alpha|\|f\|_\infty.$$

The triangle inequality is a little harder. The proof, while not difficult, uses some lemmas about convex functions that are beyond the scope of the present post. I'll omit it here; the proof can be found on the Wikipedia page for the Minkowski inequality. Another important inequality is Hölder's inequality, which says that if $p$ and $q$ are numbers greater than or equal to one satisfying $1/p+1/q=1$ (and we say that $p=1$ and $q=\infty$ satisfy this relationship), then for $f\in L^p(\mu)$ and $g\in L^q(\mu)$ we have
$$\|fg\|_1\le\|f\|_p\|g\|_q.$$
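Since I am not proving either inequality, a numerical spot check on a finite measure space may be reassuring. This is only a sanity check under made-up weights and function values, not a substitute for the proofs; the exponent $p=3$ and the helper name `p_norm` are arbitrary choices.

```python
# Hypothetical four-point measure space with complex-valued f and g.
weights = [0.5, 1.5, 2.0, 0.25]
f = [1 + 2j, -3 + 0.5j, 0.25 - 1j, 4j]
g = [2 - 1j, 0.5 + 0j, -1 + 1j, 1 + 1j]

def p_norm(h, p):
    """p-norm for finite p on the weighted finite space."""
    return sum(w * abs(z) ** p for w, z in zip(weights, h)) ** (1.0 / p)

p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1

# Hoelder: ||fg||_1 <= ||f||_p ||g||_q
holder_lhs = p_norm([a * b for a, b in zip(f, g)], 1.0)
holder_rhs = p_norm(f, p) * p_norm(g, q)

# Minkowski (triangle inequality): ||f + g||_p <= ||f||_p + ||g||_p
minkowski_lhs = p_norm([a + b for a, b in zip(f, g)], p)
minkowski_rhs = p_norm(f, p) + p_norm(g, p)
```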

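One important thing to know about $L^p$ spaces is that they are Banach spaces; that is, $L^p(\mu)$ is complete in the p-norm. To see this for finite $p$, let $f_n\in L^p(\mu)$ be a Cauchy sequence. Using standard properties of Cauchy sequences, we can define a subsequence $f_{n_j}$ satisfying
$$\|f_{n_{j+1}}-f_{n_j}\|_p\le\frac{1}{2^j}.$$
Using the triangle inequality for the p-norm, we have
$$\left\|\sum_{j=1}^{k}|f_{n_{j+1}}-f_{n_j}|\right\|_p\le\sum_{j=1}^{k}\|f_{n_{j+1}}-f_{n_j}\|_p.$$
The right-hand side converges in the limit $k\to\infty$, so (by the monotone convergence theorem, applied to the $p$-th powers of the partial sums) the series $g=\sum_{j=1}^{\infty}|f_{n_{j+1}}-f_{n_j}|$ defines a function $g\in\mathcal{L}^p(\mu)$. For this to be true, $g$ must be finite almost everywhere. As such, the limit
$$f=\lim_{k\to\infty}f_{n_{k+1}}=f_{n_1}+\sum_{j=1}^{\infty}(f_{n_{j+1}}-f_{n_j})$$
exists almost everywhere as a function in $\mathcal{L}^p(\mu)$, and thus defines an element of $L^p(\mu)$. Moreover, $|f-f_{n_k}|^p\le g^p$ almost everywhere with $g^p$ integrable, so the dominated convergence theorem gives $\|f-f_{n_k}\|_p\to0$. So the Cauchy sequence $\{f_n\}$ has a convergent subsequence in $L^p(\mu)$, which means it converges in $L^p(\mu)$. This shows that $L^p(\mu)$ is complete.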

In the case that $p$ is infinite, let $f_n\in L^\infty(\mu)$ be a Cauchy sequence, and let $E$ be the complement of the union of all sets of the form
$$\{x\mid|f_n(x)-f_m(x)|>\|f_n-f_m\|_\infty\}.$$
There are countably many such sets, and they all have measure zero, so $E^c$ has measure zero. On $E$, we have
$$|f_n(x)-f_m(x)|\le\|f_n-f_m\|_\infty,$$
so $f_n$ is uniformly Cauchy on $E$ and therefore converges uniformly on $E$ to some function $f$, which we extend by zero to $E^c$. This means that for any $\epsilon>0$, there exists some integer $N$ for which $n\ge N$ implies $|f_n-f|<\epsilon$ on $E$. So the set
$$\{x\mid|f_n(x)-f(x)|>\epsilon\}$$
is contained in $E^c$ for all $n\ge N$. This set must have measure zero (since it lies within a set of measure zero), which implies
$$\|f_n-f\|_\infty\le\epsilon.$$
Since $\epsilon$ was arbitrary, $f_n$ converges to $f$ in $L^\infty(\mu)$.

The last thing we will show is that $L^2(\mu)$ is not just a Banach space, but a Hilbert space. For $f,g\in L^2(\mu)$, we define the inner product
$$\langle f|g\rangle=\int_X d\mu\,\bar{f}g.$$
We are implicitly treating $f$ and $g$ as functions, rather than equivalence classes of functions, but this is fine; the integral in the above equation does not change if either $f$ or $g$ is changed on a set of measure zero. The only important thing is to check that the integral exists in the first place; but this follows from Hölder's inequality, which gives $\|\bar{f}g\|_1\le\|\bar{f}\|_2\|g\|_2<\infty$. This inner product clearly induces the 2-norm, and is linear in $g$, and is positive definite. The only thing we need to check is that it satisfies $\langle f|g\rangle=\overline{\langle g|f\rangle}$, but this follows readily from considering real and imaginary parts of integrals.
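The properties of the inner product can be spot-checked on a finite measure space, where $L^2(\mu)$ is just a weighted copy of $\mathbb{C}^n$. The weights, function values, and helper names `inner` and `norm2` below are assumptions made for illustration, and comparisons are up to floating-point rounding.

```python
# Hypothetical three-point measure space with complex-valued f and g.
weights = [0.5, 1.5, 2.0]
f = [1 + 2j, -3 + 0.5j, 0.25 - 1j]
g = [2 - 1j, 0.5 + 0j, -1 + 1j]

def inner(f, g):
    """<f|g> = integral of conj(f) * g; antilinear in f, linear in g."""
    return sum(w * a.conjugate() * b for w, a, b in zip(weights, f, g))

def norm2(f):
    """2-norm: square root of the integral of |f|^2."""
    return sum(w * abs(a) ** 2 for w, a in zip(weights, f)) ** 0.5

conjugate_symmetric = abs(inner(f, g) - inner(g, f).conjugate()) < 1e-12
induces_norm = abs(inner(f, f).real - norm2(f) ** 2) < 1e-9
# Hoelder with p = q = 2 is the Cauchy-Schwarz inequality.
cauchy_schwarz = abs(inner(f, g)) <= norm2(f) * norm2(g) + 1e-12
```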

