This academic term, some colleagues at Stanford and I are running a journal club on Hamiltonian simulation — the problem of how to use a quantum computer to simulate the time evolution of a physical system. Hamiltonian simulation is a hot topic in research, in part because it's believed that simulating certain systems on quantum computers will allow us to probe aspects of those systems that we don't know how to access with traditional laboratory experiments.
The earliest approach to this problem, and one that is still practically useful for certain applications, makes use of the Trotter decomposition and its generalization the Trotter-Suzuki decomposition. These are algorithms for decomposing a time evolution operator that acts simultaneously on the entire quantum system, into a sequence of time evolution operators that act locally on only a few physical sites at a time. Specifically, given a time-independent Hamiltonian H=∑jhj, we would like to find a way to approximate the time evolution operator e−iHt by some product
M∏μ=1e−itμhcμ.
That is, we should be able to make some list {hcμ}Mμ=1 of local terms appearing in the Hamiltonian, evolve with each one for a time tμ, and in doing so approximate the full time evolution e−iHt.
By approximate, we mean that the two operators should be close in the operator norm. The operator norm ‖A‖ is given by the largest singular value of A, which can be realized as
‖A‖2=max|ψ⟩≠0⟨ψ|A†A|ψ⟩⟨ψ|ψ⟩.
We say that a Trotter-Suzuki decomposition of the form (1) approximates e−iHt with error at most ϵ if we can prove
‖M∏μ=1e−itμhcμ−e−iHt‖≤ϵ.
Quantum computers are designed to implement local unitaries on a few qubits at a time. The Trotter-Suzuki decomposition, because it lets us approximate a true Hamiltonian time evolution by a sequence of local unitaries, is an effective approach to simulating Hamiltonians on quantum computers.
In section 1 of this post, I review the Trotter decomposition, which appeared in this paper and is the "first-order" version of a decomposition later improved by Suzuki. In section 2, I review the Suzuki decomposition as presented in this paper. I also comment on the coefficients of higher-order error terms in the
Trotter-Suzuki decomposition, and the limitations they place on the
practical effectiveness of the Suzuki formula.
Thanks to Shaun Datta and Ani Krishna for spotting errors in this post.
Prerequisites: Only elementary quantum mechanics. Understanding the circuit model of quantum computation is probably helpful for understanding the motivation.
Table of Contents
- The Trotter decomposition
- The Suzuki decomposition
1. The Trotter decomposition
Suppose we have a Hamiltonian H expressible as a sum of local terms,
H=∑jhj,
with corresponding time evolution operator
U(t)=e−it∑jhj.
If t were small, then we could imagine approximating this operator using Taylor expansions of the exponential. But we can't put that assumption in by hand, because generally we will want to be able evolve quantum systems for long times. So the trick Trotter came up with is to subdivide the time evolution into N time steps, each of length δ=t/N, and consider the limit as N becomes large. We can write the time evolution operator as
U(t)=(e−iδ∑jhj)N.
The
Baker-Campbell Hausdorff formula tells us that the difference between
e−iδ∑jhj and
∏je−iδhj can be written as a power series in
δ, with the first term being of order
δ2. So, naively, we can write
U(t)=(∏je−iδhj+O(δ2))N=(∏je−iδhj)N+O(Nδ2).This looks pretty good! It tells us that the operator norm difference between U(t) and (∏e−iδhj)N scales like Nδ2. So if we want the total error to be ϵ, then we must choose N to satisfy
ϵ∼t2N,
or
N∼t2ϵ.
So the Trotter formula gives us a way of approximating time evolution to accuracy ϵ, when ϵ/t is small, by applying a number of local gates that scales like t2/ϵ.
Of course, there are parameters other than ϵ/t that might be relevant in a practical computation of the error. We might want to know how the error scales as we increase the total number of local terms in the Hamiltonian, or as we vary the operator norms of the local terms in the Hamiltonian. These are interesting questions, but beyond the scope of the present post. In order to address them, you'd need to use the Zassenhaus formula to bound higher-order terms in the series relating e−it∑jhj to ∏je−ithj.
2. The Suzuki decomposition
Suzuki's idea for approximating a time evolution operator by a sequence of local gates starts from the same place as the Trotter formula: let's write
U(t)=e−it∑jhj=(e−iδ∑jhj)N,
and then approximate e−iδ∑jhj for small δ by a sequence of local operators,
e−iδ∑jhj=e−iδt1hc1…e−iδtMhcM+O(δk),
where k≥2 is an integer that we'd like to make as large as possible to get the best possible approximation.
The Trotter formula tells us we can set hcj=hj and tj=1 in equation (2) and get an error term of order δ2. But we can certainly imagine there being longer strings of local operators hcμ and times tμ that give us even better approximations! A priori, the only restriction on the operators hcμ and the times tμ is that the order-δ terms on each side of equation (2) should match, which gives us
∑jhj=∑μtμhcμ.
If the local terms in the Hamiltonian are all linearly independent, then this is just the statement that on the right-hand side of (2), each local term hj should be applied for total time δ. You obtain the Trotter formula by doing the whole δ-length time evolution at once for each local term in sequence.
We can think of equation (3) as the unique restriction on equation (2) that constraints the error term to be order δ2 or higher. We could try to continue this way of thinking, and ask, "what restrictions are placed on tμ and hcμ by requiring the error term to be order δ3 or higher?" If we were to match the order-δ2 terms on both sides of equation (2) and simplify, we'd obtain the formula
(∑jhj)2=∑μ≠νtμtνhcμhcν+(∑μtμhcμ)2.
Thanks to equation (3) we know the left side of the equation cancels the second term on the right side, and we obtain the restriction
∑μ≠νtμtνhcμhcν=0.
This restriction isn't so bad. But already it doesn't have as nice of a physical interpretation as does (3), and we can see that as we go to higher and higher order we're going to produce more and more nontrivial restrictions on the operators hcμ and times tμ.
Suzuki's idea was to use a simple ansatz for a string of exponentials that might work to approximate e−iδ∑jhj up to order δk, then to write equation (2) as a constraint on the parameters in that ansatz. The ansatz is recursive. To start, assume there is an operator Qk−1(δ) satisfying
e−iδ∑jhj=Qk−1(δ)+O(δk).
We will construct an ansatz for the next-order approximation Qk by applying Qk−1(δ) repeatedly for some sequence of times p1δ,…,pΩδ, i.e., we will write
Qk(δ)=Ω∏α=1Qk−1(pαδ).
We now write
e−iδ∑jhj=Qk(δ)+O(δk+1),
and ask whether there exist numbers {pα} so that this equation holds.
As a first observation, we note that for k≥2 the operators Qk−1(δ) and Qk(δ) must agree at linear order in δ, which implies the restriction ∑αpα=1. We then write
e−iδ∑jhj=∏αe−iδpα∑jhj,
and substitute in equation (5). When we substitute, it will be helpful to write out the order-δk term explicitly, so we have
e−iδpα∑jhj=Qk−1(pαδ)+pkαδkA+O(δk+1).
After substitution, we obtain
e−iδ∑jhj=∏α(Qk−1(pαδ)+pkαδkA+O(δk+1)).
Expanding out the product to leading order, we obtain
e−iδ∑jhj=Qk(δ)+∑αpkαδkA+O(δk+1).
So if we fix the coefficients pα so that they satisfy ∑αpkα=0, then all lingering terms at order δk sum to zero, and we are left with equation (6).
This is pretty good, but unfortunately it appears we have a little bit of a problem. For odd k, we can certainly find coefficients pα satisfying ∑αpα=1 and ∑αpkα=0. For example, the coefficients
p=p1=p2=⋯=pΩ−1,pΩ=1−(Ω−1)p
satisfy ∑αpα=1, and the equation ∑αpkα=0 becomes
(Ω−1)pk+(1−(Ω−1)p)k=0,
which has a real solution p for any odd k and Ω>1. (This is because solutions to polynomials with real coefficients come in complex conjugate pairs, so if a polynomial has an odd number of roots then at least one of them must be real.) However, for even k, the only solution to ∑αpkα=0 is p1=⋯=pΩ=0, which does not satisfy the other constraint ∑αpα=1.
So it appears, for the moment at least, that Suzuki's ansatz has hit a critical problem. It simply cannot be applied to produce an operator Qk for even k. But there is a solution! What if, for odd k, we were able to show
e−iδ∑jhj=Qk(δ)+O(δk+2)?
Then we could define the even term Qk+1 as equal to Qk, and define the next odd term Qk+2 from Qk using Suzuki's ansatz.
In particular, Suzuki shows that equation (7) holds if Qk has the symmetry property Qk(δ)Qk(−δ)=1. The proof is simple: we write the order-δk+1 term of the time evolution operator explicitly as
e−iδ∑jhj=Qk(δ)+Aδk+1+O(δk+2),
then expand the identity 1=eiδ∑jhje−iδ∑jhj in δ to obtain
1=Qk(−δ)Aδk+1+AQk(δ)(−δ)k+1+O(δk+2).
The order-δk+1 term on the right side of this equation must vanish. Since k is odd, we have (−δ)k+1=δk+1. This fact, combined with the expansion Qk(δ)=1+O(δ), gives us the vanishing of the order-δk+1 term as
A=0.
So Qk(δ) does indeed agree to e−iδ∑jhj up to and including order δk+1.
So Suzuki's idea is to start with some symmetric approximant Q2(δ), with
e−iδ∑jhj=Q2(δ)+O(δ3),
then to pick some parameters {pα} summing to 1 and with
∑αp3α=0,
in such a way that
Q3(δ)=∏αQ2(pαδ)
will also be symmetric. (In particular, because Q2 is symmetric, we can accomplish this by making the list {pα}Ωα=1 symmetric under the substitution α↦Ω+1−α.) Because Q3 is symmetric, we may define Q4≡Q3, and then define Q5 by choosing a new sequence pα. And so on and so forth, ad infinitum. A good starting choice of Q2(δ) is
Q2(δ)=e−iδh1/2…e−iδhn−1/2e−iδhne−iδhn−1/2…e−iδh1/2.
By repeating this recursive algorithm, we can produce, for arbitrary integers k≥1, an operator Qk(δ) that is made up of a sequence of exponentials of local operators satisfying
e−iδ∑jhj=Qk(δ)+O(δk+1).
The full time evolution operator,
(e−iδ∑jhj)N,
is then given by
Qk(δ)N+O(Nδk+1).
So to get an error of order ϵ, we must choose Nδk+1=tk+1/Nk to be order ϵ, i.e., we must choose N to be order
N∼t(k+1)/kϵ1/k.
For the Trotter formula at k=1, we had to choose N∼t2/ϵ; for sufficiently large k, this formula gives N∼t! Naively, this looks like an improvement; but while the total number of local gates in the Trotter formula scaled only with N, the total number of gates in this formula scales also with the number of iterations k, because each operator Qk is made up of a number of local gates depending on how many parameters pα we choose at each level of the Suzuki ansatz. If we choose the same number Ω of parameters at each level, and denote by L the number of gates in our seed approximant Q2, then we have:
#(Q4)=#(Q3)=Ω#(Q2)=LΩ,
#(Q6)=#(Q5)=#(Q3)Ω=LΩ2,
and so on, giving the general formula
#(Qk)=LΩfloor((k−1)/2).
The actual number of gates required for a Suzuki decomposition therefore goes like
N#(Qk)∼Ωfloor((k−1)/2)t(k+1)/kϵ1/k.
For a fixed time t and error threshold ϵ, the prefactor Ωfloor((k−1)/2) can make a big difference for large k. So in practice, designing a Trotter-Suzuki algorithm for a particular simulation problem involves choosing a value of k that optimizes between these two penalties: the largeness of t(k+1)/k/ϵ1/k if k is too small, and the largeness of Ωfloor((k−1)/2) if k is too big.
One other practical thing to consider is that in Qk(δ), the actual time interval τ appearing in any particular local term e−iτh is equal to δ times a product of ∼k/2 parameters pα at various levels. So if the parameters pα are bigger than one, then for large k and fixed δ we might end up needing to evolve local operators for times exponentially long in k, which causes some practical issues. It is therefore useful to choose a recursion scheme so that all parameters pα have magnitude uniformly bounded below one. A particular recursion scheme that Suzuki chooses is to set Ω=5 at each level, and to set
p1=p2=p4=p5=p,
p3=1−4p.
This list is symmetric, so it maintains the symmetry properties of Qk for all k. For this list, the equation ∑αpkα is given by
4pk+(1−4p)k=0,
which for k odd has the real solution
p=14−41/k,
which satisfies
0.41≈(4−22/3)−1≥p>1/3
and therefore
−0.66≈1−4(4−22/3)−1≤(1−4p)<−1/3
for all k≥3. This scheme successfully bounds |pα| uniformly below one for all levels k.
Comments
Post a Comment