We analyze randomized benchmarking for arbitrary gate-dependent noise and prove that the exact impact of gate-dependent noise can be described by a single perturbation term that decays exponentially with the sequence length. That is, the exact behavior of randomized benchmarking under general gate-dependent noise converges exponentially to a true exponential decay of exactly the same form as that predicted by previous analysis for gate-independent noise. Moreover, we show that the operational meaning of the decay parameter for gate-dependent noise is essentially unchanged, that is, we show that it quantifies the average fidelity of the noise between ideal gates. We numerically demonstrate that our analysis is valid for strongly gate-dependent noise models. We also show why alternative analyses do not provide a rigorous justification for the empirical success of randomized benchmarking with gate-dependent noise.

Author comments: It measures what you expect. Comments welcome. v2: removed an inconsistent assumption from theorem 3 and clarified discussion of prior work. Results unchanged. v3: further clarified discussion of prior work, numerics now available at https://github.com/jjwallman/numerics. v4: licence change as required by Quantum.

http://arxiv.org/abs/1703.09835

http://arxiv.org/pdf/1703.09835.pdf

So what is the deal?

Does this negate all the problems with https://scirate.com/arxiv/1702.01853 ?

That's a hard question to answer. I suspect that on any questions that aren't precisely stated (and technical), there's going to be some disagreement between the authors of the two papers. After one read-through, my tentative view is that each of the two papers addresses three topics which are pretty parallel. They come to similar conclusions about the first two: (1) The pre-existing theory of RB didn't give the right answers in some circumstances (sometimes by orders of magnitude); and (2) RB fidelity decays are always really close to exponential decays. [This paper's analysis appears to be stronger, as it should be; I think it gives a different reason for (1) and a stronger proof for (2)]. The two papers appear to come to different conclusions about the third shared question: (3) Does the RB number correspond to a "fidelity"? The earlier paper concluded "Not obviously, and definitely not the usual definition." This paper concludes that it does correspond to a fidelity. I think the argument/derivation looks intriguing and promising, but I haven't wrapped my head around it yet.

Oh, and obviously I'm not an unbiased observer, as I'm a co-author of the earlier paper.

I disagree with the assertion (1) that the previous theory didn't give "the right answers." The previous theory was sound; no one is claiming that there are any mistakes in any of the proofs. However, there were nonetheless some issues.

The first issue is that the previous analysis of gate-dependent noise was, while sound, too weak to be useful in practical situations. This was folklore among the cognoscenti and a couple of papers had been written about this, but to my knowledge, a clear general statement of this fact was lacking before 1702.01853 came out. Second, because the gate-_dependent_ noise theory was not useful, it is tempting to use the gate-_independent_ noise results even when applying RB in a setting with gate-dependent noise. One might be forgiven for assuming that the weakness in the theoretical bounds was a limitation of the theory and not a concern for realistic applications.

The important contribution of 1702.01853 was to show that there are reasonably realistic scenarios where a practitioner could mislead themselves while using RB in the presence of gate-dependent noise, even if the gate dependence is very weak. The missteps would follow from (a) the experiment not respecting the assumptions of the theory and then (b) applying the theory anyway. Their example involves giving all of the gates a small gate-dependent unitary offset, and then showing that this offset collection of gates does not have a good fidelity with the "true" gates that one was aiming for, even though a basic RB experiment (using (a) and (b)) would conclude that the fidelity was high.

What Joel's paper shows is that one can essentially always reinterpret what one means by "the true gates" such that the experimentally determined RB fidelity matches the average fidelity of the gates, even with realistic gate-dependent noise. The pathologies identified by the authors of 1702.01853 arise when one attempts to ascribe an invariant meaning to a gauge-dependent quantity, as is done for example in Gate Set Tomography (GST). Joel shows that, although there can be no preferred gauge, RB experiments do indeed measure a fidelity for a set of gates whose distribution in gate-dependent noise has mean zero. (This is point (3) that Robin mentions.) Moreover, the experiment is well described by an exponential decay curve (point (2) above).

Therefore I think it is accurate to say that, yes, it convincingly addresses the most important issues raised by 1702.01853.
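To make the gauge point concrete, here is a minimal numerical sketch (not taken from either paper). It assumes, as one reading of the present paper's theorem 2, that the RB decay parameter $p$ is the dominant eigenvalue of $\mathbb{E}_G[G_u \otimes \tilde{G}]$, where $G_u$ is the unital block of the ideal gate. The noise model (small gate-dependent z-rotations), the gauge transformation `S`, and all function names are hypothetical choices for illustration. The sketch compares the gauge-invariant RB infidelity $r = (1-p)/2$ with the "naive" infidelity between each noisy gate and its target, before and after a unitary change of gauge.

```python
import itertools
import numpy as np

def clifford_ptms():
    """Ideal single-qubit Cliffords as 4x4 Pauli transfer matrices (PTMs)."""
    gates = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product([1.0, -1.0], repeat=3):
            block = np.zeros((3, 3))
            for row, (col, sign) in enumerate(zip(perm, signs)):
                block[row, col] = sign
            if np.isclose(np.linalg.det(block), 1.0):  # proper rotations only
                ptm = np.eye(4)
                ptm[1:, 1:] = block
                gates.append(ptm)
    return gates  # the 24 rotations of the cube

def rotation_ptm(axis, theta):
    """PTM of a rotation by theta in the plane orthogonal to axis (0=x, 1=y, 2=z).
    The sign convention of the angle does not matter for this sketch."""
    i, j = [k for k in range(3) if k != axis]
    c, s = np.cos(theta), np.sin(theta)
    ptm = np.eye(4)
    ptm[1 + i, 1 + i] = c
    ptm[1 + j, 1 + j] = c
    ptm[1 + i, 1 + j] = -s
    ptm[1 + j, 1 + i] = s
    return ptm

def rb_decay(ideal, noisy):
    """Dominant eigenvalue of mean_G [G_u kron G~] (assumed form of p)."""
    M = sum(np.kron(G[1:, 1:], Gt) for G, Gt in zip(ideal, noisy)) / len(ideal)
    eigs = np.linalg.eigvals(M)
    return eigs[np.argmax(np.abs(eigs))].real

def naive_infidelity(ideal, noisy):
    """Mean average gate infidelity of the maps G^T G~ for a qubit (d = 2)."""
    fids = [(np.trace(G.T @ Gt) / 2 + 1) / 3 for G, Gt in zip(ideal, noisy)]
    return 1 - np.mean(fids)

rng = np.random.default_rng(1)
ideal = clifford_ptms()
angles = rng.normal(0, 0.01, len(ideal))           # hypothetical noise strength
noisy = [rotation_ptm(2, t) @ G for G, t in zip(ideal, angles)]

S = rotation_ptm(0, 0.1)                           # hypothetical unitary gauge change
noisy_gauged = [S @ Gt @ S.T for Gt in noisy]      # same gate set, different gauge

r_rb = (1 - rb_decay(ideal, noisy)) / 2
r_rb_gauged = (1 - rb_decay(ideal, noisy_gauged)) / 2
print("RB infidelity:   ", r_rb, r_rb_gauged)
print("naive infidelity:", naive_infidelity(ideal, noisy),
      naive_infidelity(ideal, noisy_gauged))
```

Under these assumptions the RB number is unchanged by the change of gauge, because the averaged operator is only conjugated by $I \otimes S$, while the naive gate-by-gate infidelity changes substantially. The sketch does not compute $\mathcal{L}$ or $\mathcal{R}$ themselves.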

I agree with much of your comment. But, the assertion you're disagreeing with isn't really mine. I was trying to summarize the content of the present paper (and 1702.01853, hereafter referred to as [PRYSB]). I'll quote a few passages from the present paper to support my interpretation:

1. "[T]he `weakly' gate-dependent noise condition assumed in Ref. [27] does not hold for realistic noise... For the analysis of Ref. [27] to be valid, theta has to be calibrated to within 1e-5 radians" (page 1).

2. "[T]he original analysis of Ref. [27] only applies in limited regimes" (page 2).

3. "...even when the analysis of Ref. [27] differs from the observed decay rate by orders of magnitude." (page 10).

4. "As observed by Ref. [29], the predicted decay using the value from Ref. [27] is in stark disagreement with the observed value." (Fig. 1)

5. "...even when the analysis of Ref. [27] is inconsistent by orders of magnitude." (page 12).

When you say "the previous theory was sound", I'm guessing you mean that error bounds on the 0th and 1st order models (the latter of which is explicitly intended as a predictive theory for gate-dependent noise) were derived, and the predictions are correct to within those error bounds. I concur (and [PRYSB] stated this).

But while the theorems are correct, the error bounds were ignored so thoroughly in practice (including by some of the cognoscenti) that I think it's difficult to avoid associating "the previous theory" with the explicit prediction of the 1st order model in [27], *sans* error bounds. Which is my reading of the quotes above (even though it's 100% clear in other parts of the paper that Joel appreciates the nuance about error bounds).

Two more questions for you (re: your comment)

1. You suggest the key is to "reinterpret what one means by `the true gates' such that the [...] RB fidelity matches the average fidelity of the gates". This wasn't quite how I was reading the paper. I would have said that this paper's key contribution is to redefine **the error in each gate**. In previous theory, this was defined as $C^\dagger \tilde{C}$, but Joel shows that if we define it instead as $\mathcal{R}\mathcal{L}$, with a particular choice of the non-unique $\tilde{C} = \mathcal{L} C \mathcal{R}$ decomposition, then happiness ensues. It seems like the target gates aren't reinterpreted, just what "the noise on the gate" means?

2. With apologies if this sounds defensive... where in [properly done] GST does one "attempt to ascribe an invariant meaning to a gauge-dependent quantity?" AFAIK, every publication on GST has been aware of gauge, and when gauge-variant quantities are requested, the standard approach is to turn them into gauge-invariant ones by explicitly minimizing them over gauge. While this is certainly an imperfect approach, I'd suggest that the GST community has been in the forefront of *not* ascribing invariant meaning to gauge-dependent quantities.

Yes, I did indeed mean that the results of the previous derivations are correct and that predictions from experiments lie within the stated error bounds. To me, it is a different issue if someone derives something with a theoretical guarantee that might have sufficient conditions that are too strong, and someone wants to use this theory, but they can't match the sufficient conditions, so they appeal to heuristics to argue that they can use the theory anyway. Empirical science couldn't proceed if people didn't do this, and it's fine as long as the assumptions are stated clearly. But if it fails, it is not that the theory is broken, just that its domain of applicability is not as wide as the heuristics would suggest. I suspect we are both in complete agreement on these points, and I am only emphasizing this particular wording so as to avoid impugning the credibility of the authors of [27], who have made many valuable contributions to the theory.

The fact that the rigorous analysis was ignored in practice is indeed a problem, I completely agree with you. I'm well aware that some RB estimates in the literature are not particularly rigorous, and that one can severely mislead oneself; see e.g. my March Meeting talk from 2016. It's great to see that the community is homing in on the precise issues that lead to pitfalls, and both [PRYSB] and Joel's paper make important steps in this direction.

Regarding your two additional questions/comments

1. If you prefer to say it's the error in each gate, I don't disagree. Either perspective seems valid to me, and I wasn't trying to make a mathematical statement in any case (hence the scare quotes).

2. I knew you wouldn't be able to resist my troll on GST. :) One thing that GST does do is output specific gates, preparations, and measurements simultaneously. As you yourself have relentlessly stressed, this isn't actually possible without picking a gauge. So, while you implicitly report the gauge in which the gate set is defined (see e.g. the discussion in Section IV.E of your paper, 1605.07674), you also report specific numbers for gauge-variant quantities like fidelity and diamond norm without really analyzing the spread or potential implications that might arise from different choices of gauge. It seems that in light of these recent papers, one ought to carefully revisit these results to ensure that the same difficulties that you've highlighted with gate-dependent noise in RB don't also apply to the gauge-variant metrics reported by GST.

I agree that we pretty much agree on all these points! For the record, though... when you describe the pre-2017 state of play as "...*someone wants to use this theory, but they can't match the sufficient conditions, so they appeal to heuristics to argue that they can use the theory anyway*," this implies that most practitioners were aware of how loose the error bounds on the estimated RB number were, but used them anyway (appealing to heuristics). My impression was that very few scientists realized just how loose the bounds were -- and that the published estimates were being used in the mistaken belief that they were precise.

Fortunately, this is all water under the bridge now! Hopefully, Joel's new theory does indeed provide a precise, computable, and interpretable (as fidelity) estimate of the RB number.

Inasmuch as I've been able to work through it, the paper makes sense to me so far. The one real concern I have is whether the new estimate -- $1-p(\mathcal{R}\mathcal{L})$ -- is actually computable. There are many, many ways to choose $\mathcal{R}$ and $\mathcal{L}$, and not all of them yield useful estimates. If I understand correctly, Theorems 2-4 prove that there exists a "good" choice, such that $r \approx 1-p(\mathcal{R}\mathcal{L})$. Theorem 2 gives a generalized eigenvalue problem from which suitable $\mathcal{R}$ and $\mathcal{L}$ can be extracted -- but these are still non-unique, and you have to find ones consistent with Theorem 3. At this point I get a blinding headache, and I can't figure out whether there's a simple explicit formula for the estimate (as there was in the old theory). Maybe Joel will tell us.

Finally... well trolled. :) In the interest of staying on topic, I'll just say: yes, good point, I agree that it would be a really good idea to think hard about different choices of gauge -- we've done some of that, but more would be good. The Right Thing To Do is to use gauge-invariant quantities (like the RB number!) exclusively, but that seems hard.

Regarding the pre-2017 state of play, I think experimentalists knew there was a problem for large errors and theorists had known there was a potential problem (as shown by grant proposals for the QCVV program) but nobody had really sat down and thought about how everything behaved for small errors. I knew how bad the bounds were since 2015 but also knew that they could be substantially improved by expanding around the twirl of the average noise. This argument gives good bounds for noise that is primarily stochastic (as quantified by the fidelity), but the current analysis was needed for useful bounds for coherent errors (which are more relevant for gate-dependent noise).

The new estimate is indeed computable and was used to compute the estimates in the numerics section. The algorithm is implicit in the proof of theorem 2: solve some eigenvector problems (eq. 24a and c and eq. 27) and combine the solutions to obtain solutions to eq. 21 a and b. Then rescale one or both of the solutions by multiplying by a depolarizing channel to satisfy eq. 21c. FYI: An easy mistake is to forget the transpose in eq. 27 or to get the reshaping (i.e., the inverse of the vectorization map) transposed.
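The transpose pitfall mentioned above can be illustrated independently of the paper's equations. The following is a minimal sketch assuming a column-stacking vectorization; whether eq. 27 uses column or row stacking is not stated in this thread, so the convention and the helper names `vec` and `unvec` are assumptions. It checks the identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$ and shows what happens when the transpose is dropped or when the reshaping uses the wrong memory order.

```python
import numpy as np

rng = np.random.default_rng(0)
A, X, B = (rng.normal(size=(3, 3)) for _ in range(3))

def vec(M):
    # Column-stacking vectorization (Fortran order).
    return M.reshape(-1, order="F")

def unvec(v, shape):
    # Inverse of vec: must use the same (Fortran) order.
    return v.reshape(shape, order="F")

lhs = vec(A @ X @ B)
right = np.kron(B.T, A) @ vec(X)   # correct: B enters transposed
wrong = np.kron(B, A) @ vec(X)     # transpose forgotten

print(np.allclose(lhs, right))     # identity holds
print(np.allclose(lhs, wrong))     # fails for generic B

# Reshaping with NumPy's default (C) order silently returns the transpose.
X_bad = vec(X).reshape(X.shape)
print(np.allclose(X_bad, X.T), np.allclose(unvec(vec(X), X.shape), X))
```

With a row-stacking convention the roles are swapped, $\mathrm{vec}(AXB) = (A \otimes B^T)\,\mathrm{vec}(X)$, so the safest check is to verify the identity numerically on random matrices before solving the eigenvector problem.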

Theorem 3 holds for any choice of L and R satisfying all three constraints in theorem 2. Also, once you have a specific choice of L and R satisfying theorem 2, theorem 3 becomes irrelevant, as you can simply compute a direct bound on the size of the perturbation term. Also, as far as I can tell, theorem 3 is somewhat conservative.

I have posted an open review of this paper here: https://github.com/csferrie/openreviews/blob/master/arxiv.1703.09835/arxiv.1703.09835.md