Loading [MathJax]/jax/output/CommonHTML/jax.js
Skip to main content

Hamiltonian simulation via the Trotter-Suzuki decomposition

This academic term, some colleagues at Stanford and I are running a journal club on Hamiltonian simulation — the problem of how to use a quantum computer to simulate the time evolution of a physical system. Hamiltonian simulation is a hot topic in research, in part because it's believed that simulating certain systems on quantum computers will allow us to probe aspects of those systems that we don't know how to access with traditional laboratory experiments.

The earliest approach to this problem, and one that is still practically useful for certain applications, makes use of the Trotter decomposition and its generalization the Trotter-Suzuki decomposition. These are algorithms for decomposing a time evolution operator that acts simultaneously on the entire quantum system, into a sequence of time evolution operators that act locally on only a few physical sites at a time. Specifically, given a time-independent Hamiltonian H=jhj, we would like to find a way to approximate the time evolution operator eiHt by some product
Mμ=1eitμhcμ.
That is, we should be able to make some list {hcμ}Mμ=1 of local terms appearing in the Hamiltonian, evolve with each one for a time tμ, and in doing so approximate the full time evolution eiHt.

By approximate, we mean that the two operators should be close in the operator norm. The operator norm A is given by the largest singular value of A, which can be realized as
A2=max|ψ0ψ|AA|ψψ|ψ.
We say that a Trotter-Suzuki decomposition of the form (1) approximates eiHt with error at most ϵ if we can prove
Mμ=1eitμhcμeiHtϵ.

Quantum computers are designed to implement local unitaries on a few qubits at a time. The Trotter-Suzuki decomposition, because it lets us approximate a true Hamiltonian time evolution by a sequence of local unitaries, is an effective approach to simulating Hamiltonians on quantum computers.

In section 1 of this post, I review the Trotter decomposition, which appeared in this paper and is the "first-order" version of a decomposition later improved by Suzuki. In section 2, I review the Suzuki decomposition as presented in this paper. I also comment on the coefficients of higher-order error terms in the Trotter-Suzuki decomposition, and the limitations they place on the practical effectiveness of the Suzuki formula.

Thanks to Shaun Datta and Ani Krishna for spotting errors in this post.

Prerequisites: Only elementary quantum mechanics. Understanding the circuit model of quantum computation is probably helpful for understanding the motivation.

Table of Contents

  1. The Trotter decomposition
  2. The Suzuki decomposition

1. The Trotter decomposition

Suppose we have a Hamiltonian H expressible as a sum of local terms,
H=jhj,
with corresponding time evolution operator
U(t)=eitjhj.
If t were small, then we could imagine approximating this operator using Taylor expansions of the exponential. But we can't put that assumption in by hand, because generally we will want to be able evolve quantum systems for long times. So the trick Trotter came up with is to subdivide the time evolution into N time steps, each of length δ=t/N, and consider the limit as N becomes large. We can write the time evolution operator as
U(t)=(eiδjhj)N.
The Baker-Campbell Hausdorff formula tells us that the difference between eiδjhj and jeiδhj can be written as a power series in δ, with the first term being of order δ2. So, naively, we can write
U(t)=(jeiδhj+O(δ2))N=(jeiδhj)N+O(Nδ2).
This looks pretty good! It tells us that the operator norm difference between U(t) and (eiδhj)N scales like Nδ2. So if we want the total error to be ϵ, then we must choose N to satisfy
ϵt2N,
or
Nt2ϵ.
So the Trotter formula gives us a way of approximating time evolution to accuracy ϵ, when ϵ/t is small, by applying a number of local gates that scales like t2/ϵ.

Of course, there are parameters other than ϵ/t that might be relevant in a practical computation of the error. We might want to know how the error scales as we increase the total number of local terms in the Hamiltonian, or as we vary the operator norms of the local terms in the Hamiltonian. These are interesting questions, but beyond the scope of the present post. In order to address them, you'd need to use the Zassenhaus formula to bound higher-order terms in the series relating eitjhj to jeithj.

2. The Suzuki decomposition

Suzuki's idea for approximating a time evolution operator by a sequence of local gates starts from the same place as the Trotter formula: let's write
U(t)=eitjhj=(eiδjhj)N,
and then approximate eiδjhj for small δ by a sequence of local operators,
eiδjhj=eiδt1hc1eiδtMhcM+O(δk),
where k2 is an integer that we'd like to make as large as possible to get the best possible approximation.

The Trotter formula tells us we can set hcj=hj and tj=1 in equation (2) and get an error term of order δ2. But we can certainly imagine there being longer strings of local operators hcμ and times tμ that give us even better approximations! A priori, the only restriction on the operators hcμ and the times tμ is that the  order-δ terms on each side of equation (2) should match, which gives us
jhj=μtμhcμ.
If the local terms in the Hamiltonian are all linearly independent, then this is just the statement that on the right-hand side of (2), each local term hj should be applied for total time δ. You obtain the Trotter formula by doing the whole δ-length time evolution at once for each local term in sequence.

We can think of equation (3) as the unique restriction on equation (2) that constraints the error term to be order δ2 or higher. We could try to continue this way of thinking, and ask, "what restrictions are placed on tμ and hcμ by requiring the error term to be order δ3 or higher?" If we were to match the order-δ2 terms on both sides of equation (2) and simplify, we'd obtain the formula
(jhj)2=μνtμtνhcμhcν+(μtμhcμ)2.
Thanks to equation (3) we know the left side of the equation cancels the second term on the right side, and we obtain the restriction
μνtμtνhcμhcν=0.
This restriction isn't so bad. But already it doesn't have as nice of a physical interpretation as does (3), and we can see that as we go to higher and higher order we're going to produce more and more nontrivial restrictions on the operators hcμ and times tμ.

Suzuki's idea was to use a simple ansatz for a string of exponentials that might work to approximate eiδjhj up to order δk, then to write equation (2) as a constraint on the parameters in that ansatz. The ansatz is recursive. To start, assume there is an operator Qk1(δ) satisfying
eiδjhj=Qk1(δ)+O(δk).
We will construct an ansatz for the next-order approximation Qk by  applying Qk1(δ) repeatedly for some sequence of times p1δ,,pΩδ, i.e., we will write
Qk(δ)=Ωα=1Qk1(pαδ).
We now write
eiδjhj=Qk(δ)+O(δk+1),
and ask whether there exist numbers {pα} so that this equation holds.
 
As a first observation, we note that for k2 the operators Qk1(δ) and Qk(δ) must agree at linear order in δ, which implies the restriction αpα=1. We then write
eiδjhj=αeiδpαjhj,
and substitute in equation (5). When we substitute, it will be helpful to write out the order-δk term explicitly, so we have
eiδpαjhj=Qk1(pαδ)+pkαδkA+O(δk+1).
After substitution, we obtain
eiδjhj=α(Qk1(pαδ)+pkαδkA+O(δk+1)).
Expanding out the product to leading order, we obtain
eiδjhj=Qk(δ)+αpkαδkA+O(δk+1).
So if we fix the coefficients pα so that they satisfy αpkα=0, then all lingering terms at order δk sum to zero, and we are left with equation (6).

This is pretty good, but unfortunately it appears we have a little bit of a problem. For odd k, we can certainly find coefficients pα satisfying αpα=1 and αpkα=0. For example, the coefficients
p=p1=p2==pΩ1,pΩ=1(Ω1)p
satisfy αpα=1, and the equation αpkα=0 becomes
(Ω1)pk+(1(Ω1)p)k=0,
which has a real solution p for any odd k and Ω>1. (This is because solutions to polynomials with real coefficients come in complex conjugate pairs, so if a polynomial has an odd number of roots then at least one of them must be real.) However, for even k, the only solution to αpkα=0 is p1==pΩ=0, which does not satisfy the other constraint αpα=1.

So it appears, for the moment at least, that Suzuki's ansatz has hit a critical problem. It simply cannot be applied to produce an operator Qk for even k. But there is a solution! What if, for odd k, we were able to show
eiδjhj=Qk(δ)+O(δk+2)?
Then we could define the even term Qk+1 as equal to Qk, and define the next odd term Qk+2 from Qk using Suzuki's ansatz. 

In particular, Suzuki shows that equation (7) holds if Qk has the symmetry property Qk(δ)Qk(δ)=1. The proof is simple: we write the order-δk+1 term of the time evolution operator explicitly as
eiδjhj=Qk(δ)+Aδk+1+O(δk+2),
then expand the identity 1=eiδjhjeiδjhj in δ to obtain
1=Qk(δ)Aδk+1+AQk(δ)(δ)k+1+O(δk+2).
The order-δk+1 term on the right side of this equation must vanish. Since k is odd, we have (δ)k+1=δk+1. This fact, combined with the expansion Qk(δ)=1+O(δ), gives us the vanishing of the order-δk+1 term as
A=0.
So Qk(δ) does indeed agree to eiδjhj up to and including order δk+1.

So Suzuki's idea is to start with some symmetric approximant Q2(δ), with
eiδjhj=Q2(δ)+O(δ3),
then to pick some parameters {pα} summing to 1 and with
αp3α=0,
in such a way that
Q3(δ)=αQ2(pαδ)
will also be symmetric. (In particular, because Q2 is symmetric, we can accomplish this by making the list {pα}Ωα=1 symmetric under the substitution αΩ+1α.) Because Q3 is symmetric, we may define Q4Q3, and then define Q5 by choosing a new sequence pα. And so on and so forth, ad infinitum. A good starting choice of Q2(δ) is
Q2(δ)=eiδh1/2eiδhn1/2eiδhneiδhn1/2eiδh1/2.
 
By repeating this recursive algorithm, we can produce, for arbitrary integers k1, an operator Qk(δ) that is made up of a sequence of exponentials of local operators satisfying
eiδjhj=Qk(δ)+O(δk+1).
The full time evolution operator,
(eiδjhj)N,
is then given by
Qk(δ)N+O(Nδk+1).
So to get an error of order ϵ, we must choose Nδk+1=tk+1/Nk to be order ϵ, i.e., we must choose N to be order
Nt(k+1)/kϵ1/k.
For the Trotter formula at k=1, we had to choose Nt2/ϵ; for sufficiently large k, this formula gives Nt! Naively, this looks like an improvement; but while the total number of local gates in the Trotter formula scaled only with N, the total number of gates in this formula scales also with the number of iterations k, because each operator Qk is made up of a number of local gates depending on how many parameters pα we choose at each level of the Suzuki ansatz. If we choose the same number Ω of parameters at each level, and denote by L the number of gates in our seed approximant Q2, then we have:
#(Q4)=#(Q3)=Ω#(Q2)=LΩ,
#(Q6)=#(Q5)=#(Q3)Ω=LΩ2,
and so on, giving the general formula
#(Qk)=LΩfloor((k1)/2).

The actual number of gates required for a Suzuki decomposition therefore goes like
N#(Qk)Ωfloor((k1)/2)t(k+1)/kϵ1/k.
For a fixed time t and error threshold ϵ, the prefactor Ωfloor((k1)/2) can make a big difference for large k. So in practice, designing a Trotter-Suzuki algorithm for a particular simulation problem involves choosing a value of k that optimizes between these two penalties: the largeness of t(k+1)/k/ϵ1/k if k is too small, and the largeness of Ωfloor((k1)/2) if k is too big.

One other practical thing to consider is that in Qk(δ), the actual time interval τ appearing in any particular local term eiτh is equal to δ times a product of k/2 parameters pα at various levels. So if the parameters pα are bigger than one, then for large k and fixed δ we might end up needing to evolve local operators for times exponentially long in k, which causes some practical issues. It is therefore useful to choose a recursion scheme so that all parameters pα have magnitude uniformly bounded below one. A particular recursion scheme that Suzuki chooses is to set Ω=5 at each level, and to set
p1=p2=p4=p5=p,
p3=14p.
This list is symmetric, so it maintains the symmetry properties of Qk for all k. For this list, the equation αpkα is given by
4pk+(14p)k=0,
which for k odd has the real solution
p=1441/k,
which satisfies
0.41(422/3)1p>1/3
and therefore
0.6614(422/3)1(14p)<1/3
for all k3. This scheme successfully bounds |pα| uniformly below one for all levels k.

Comments

Popular posts from this blog

Pick functions and operator monotones

Any time you can order mathematical objects, it is productive to ask what operations preserve the ordering. For example, real numbers have a natural ordering, and we have xyxkyk for any odd natural number k. If we further impose the assumption y0, then order preservation holds for k any positive real number. Self-adjoint operators on a Hilbert space have a natural (partial) order as well. We write A0 for a self-adjoint operator A if we have ψ|A|ψ0 for every vector |ψ, and we write AB for self-adjoint operators A and B if we have (AB)0. Curiously, many operations that are monotonic for real numbers are not monotonic for matrices. For example, the matrices P=12(1111) and Q=(0001) are both self-adjoint and positive, so we have P+QP0, but a str...

Envelopes of holomorphy and the timelike tube theorem

Complex analysis, as we usually learn it, is the study of differentiable functions from C to C. These functions have many nice properties: if they are differentiable even once then they are infinitely differentiable; in fact they are analytic, meaning they can be represented in the vicinity of any point as an absolutely convergent power series; moreover at any point z0, the power series has radius of convergence equal to the radius of the biggest disc centered at z0 which can be embedded in the domain of the function. The same basic properties hold for differentiable functions in higher complex dimensions. If Ω is a domain --- i.e., a connected open set --- in Cn, and f:ΩCn is once differentiable, then it is in fact analytic, and can be represented as a power series in a neighborhood of any point z, i.e., we have an expression like f(z)=ak1kn(z1z)k1(znz)kn. The ...

Some recent talks (Summer 2024)

My posting frequency has decreased since grad school, since while I'm spending about as much time learning as I always have, much more of my pedagogy these days ends up in papers. But I've given a few pedagogically-oriented talks recently that may be of interest to the people who read this blog. I gave a mini-course on "the algebraic approach" at Bootstrap 2024. The lecture notes can be found here , and videos are available here . The first lecture covers the basic tools of algebraic quantum field theory; the second describes the Faulkner-Leigh-Parrikar-Wang argument for the averaged null energy condition in Minkowski spacetime; the third describes recent developments on the entropy of semiclassical black holes, including my recent paper with Chris Akers . Before the paper with Chris was finished, I gave a general overview of the "crossed product" approach to black hole entropy at KITP. The video is available here . The first part of the talk goes back in ti...