by Amara Graps
Benchmarks for quantum computers are blooming. From the (not serious) Weird Al benchmark to the (very serious) Quantum Benchmarking Initiative by DARPA, the quantum community has advanced quite a distance in the last decade. I have almost forty quantum computing benchmarking papers saved in my personal subdirectory now.
Let’s put a frame around these benchmarks with help from Amico et al., 2023, published as an IEEE blog highlight (see also their paper on arXiv).
The Difference between Standardized and Diagnostic Benchmarks
The authors say that many of the benchmarks to which we’ve grown accustomed are not capturing a device’s overall performance in an average sense; instead, they are capturing the functionality of particular algorithms running on particular hardware. Such diagnostic methods are extremely sensitive to individual error sources or device components.
Diagnostic benchmarks include those found inside application-oriented circuit libraries, which are becoming increasingly useful. The goal of these libraries is to gather different algorithms with different quantum circuit configurations in order to capture the functionality of quantum technology. Because they average across many circuits, they can, depending on how the results are interpreted, become actual benchmarks when used collectively.
Standardized benchmarks, by contrast, stress these key characteristics (a minimal code sketch of the averaging idea follows the list):
- Randomized: Eliminating biases and ensuring statistically significant results.
- Well-defined: With clear specifications and implementation procedures, leaving no room for ambiguity.
- Holistic: Encompassing various aspects of device performance, not just focusing on specific strengths.
- Device-independent: Applicable to different technologies, fostering inclusivity across the field.
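To make the “randomized” and “holistic” points concrete, here is a minimal sketch of how such a benchmark aggregates a score over many randomly sampled circuit instances rather than over one hand-picked algorithm. It is written in plain Python under my own assumptions: `sample_random_circuit` and `execute_and_score` are hypothetical stand-ins for a real circuit generator and a real hardware execution and scoring step, and the dummy values exist only so the sketch runs end to end.

```python
import random
import statistics

def sample_random_circuit(width: int, depth: int, rng: random.Random) -> dict:
    # Hypothetical placeholder: a real suite would draw random gates here.
    return {"width": width, "depth": depth, "instance_seed": rng.randrange(2**32)}

def execute_and_score(circuit: dict, rng: random.Random) -> float:
    # Dummy stand-in for hardware execution plus per-circuit success estimation;
    # a real benchmark would return a measured fidelity-like metric in [0, 1].
    return rng.uniform(0.8, 1.0)

def randomized_benchmark_score(width: int, depth: int,
                               n_instances: int = 50, seed: int = 1234):
    rng = random.Random(seed)
    scores = [execute_and_score(sample_random_circuit(width, depth, rng), rng)
              for _ in range(n_instances)]
    # Averaging over many random instances keeps any single circuit structure
    # (or error source) from dominating the reported number.
    return statistics.mean(scores), statistics.stdev(scores)

print(randomized_benchmark_score(width=5, depth=10))
```

The point of the sketch is the averaging step: the reported number describes performance across a randomized ensemble, not the behavior of one favorable circuit.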
Valuable Features of Diagnostic Benchmarks
Let’s dive deeper into Diagnostic Benchmarks. According to Amico et al., 2023, their valuable features are:
- Definition and Sensitivity: Diagnostic benchmarks are protocols that are highly sensitive to specific types of errors, such as the Hellinger fidelity of GHZ states (see the sketch after this list). They are designed to provide a clear characterization of performance in particular settings.
- Predictive Power: These methods are highly predictive for similarly structured problems, making them useful for specific tasks or applications. However, their specificity means they are not good standards for general benchmarking.
- Utility in Application-Oriented Methods: Diagnostic methods are particularly useful in application-oriented circuit libraries, which aim to capture the performance of quantum hardware by averaging over a wide set of circuits. This can elevate them to the level of true benchmarks when executed in aggregate.
- Compilation and Mitigation Techniques: In the context of maximizing performance for specific applications, diagnostic methods can incorporate compilation and mitigation techniques. This contrasts with benchmarking methods, which aim to characterize average performance across a range of tasks.
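As a concrete example of the first point, here is a minimal sketch of the Hellinger fidelity diagnostic for a GHZ state, again in plain Python under my own assumptions: the counts dictionary is made-up illustrative data standing in for computational-basis measurements of a GHZ circuit on real hardware.

```python
from math import sqrt

def hellinger_fidelity(p: dict, q: dict) -> float:
    # Classical (Hellinger) fidelity between two probability distributions:
    # F_H = (sum_x sqrt(p(x) * q(x)))**2, equal to 1 for identical distributions.
    keys = set(p) | set(q)
    return sum(sqrt(p.get(k, 0.0) * q.get(k, 0.0)) for k in keys) ** 2

def ghz_ideal_distribution(n_qubits: int) -> dict:
    # Ideal GHZ measurement statistics: half all-zeros, half all-ones.
    return {"0" * n_qubits: 0.5, "1" * n_qubits: 0.5}

# Made-up illustrative counts for a 3-qubit GHZ circuit on a noisy device.
counts = {"000": 480, "111": 470, "001": 30, "110": 44}
shots = sum(counts.values())
measured = {bitstring: c / shots for bitstring, c in counts.items()}

print(hellinger_fidelity(measured, ghz_ideal_distribution(3)))  # ~0.93 for these counts
```

Because the ideal GHZ distribution puts weight only on the all-zeros and all-ones bitstrings, any probability mass that leaks onto other outcomes lowers the score, which is exactly the kind of sharp, error-specific sensitivity the authors describe.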
By understanding these points, one can appreciate the role and limitations of diagnostic benchmarks in the broader context of quantum computing performance evaluation.
GQI’s Quantum Tech Stack Approach
Now that Diagnostic Benchmarks have a firmer definition, we can consider to which part of the Quantum Tech Stack they apply. As seen in the next figure (*), many of these benchmarks apply to the Top Stack.
GQI’s quantum technology stack consists of seven layers. Looking bottom up, we have: Quantum Plane, Control Plane, Control Logic, Architecture, Framework (divided into Framework (Hybrid) and Framework (Quantum)), Algorithms, and Applications.
The Bottom (Hardware) stack (Control Plane and Quantum Plane) is primarily focused on the physical components of the quantum computing system. This includes the management of individual qubits and their interconnections, and the development of more complex configurations such as quantum chips and modules. Furthermore, it encompasses all the essential hardware for managing, sustaining, and executing Quantum Processing Units (QPUs).
The Top Stack, consisting of Framework (Hybrid), Algorithms, and Applications, faces the user community. It includes advanced computational workflows, parallelization and runtimes, quantum algorithms, and comprehensive libraries of such algorithms. All these components work together to execute applications using high-level programming languages that are tailored to specific industries and use cases.
What are examples of these diagnostic benchmarks, as highlighted in the Amico et al., 2023 articles?
Examples of Diagnostic Benchmarks
• Finžgar, Jernej Rudi, Philipp Ross, Leonhard Hölscher, Johannes Klepsch, and Andre Luckow. 2022. “QUARK: A Framework for Quantum Computing Application Benchmarking.” arXiv. https://doi.org/10.48550/arXiv.2202.03028.
• Kordzanganeh, Mohammad, Markus Buchberger, Basil Kyriacou, Maxim Povolotskii, Wilhelm Fischer, Andrii Kurkin, Wilfrid Somogyi, Asel Sagingalieva, Markus Pflitsch, and Alexey Melnikov. 2023. “Benchmarking Simulated and Physical Quantum Processing Units Using Quantum and Hybrid Algorithms.” https://doi.org/10.1002/qute.202300043.
• Kurlej, Arthur, Sam Alterman, and Kevin M. Obenland. 2022. “Benchmarking and Analysis of Noisy Intermediate-Scale Trapped Ion Quantum Computing Architectures.” In 2022 IEEE International Conference on Quantum Computing and Engineering (QCE), 247–58. https://doi.org/10.1109/QCE53715.2022.00044.
• Li, Ang, Samuel Stein, Sriram Krishnamoorthy, and James Ang. 2022. “QASMBench: A Low-Level QASM Benchmark Suite for NISQ Evaluation and Simulation.” ACM Transactions on Quantum Computing 4 (2): 1–26. https://doi.org/10.1145/3550488.
• Lubinski, Thomas, Sonika Johri, Paul Varosy, Jeremiah Coleman, Luning Zhao, Jason Necaise, Charles H. Baldwin, Karl Mayer, and Timothy Proctor. 2023. “Application-Oriented Performance Benchmarks for Quantum Computing.” arXiv. http://arxiv.org/abs/2110.03137.
• Lubinski, Thomas, Carleton Coffrin, Catherine McGeoch, Pratik Sathe, Joshua Apanavicius, and David E. Bernal Neira. 2024. “Optimization Applications as Quantum Performance Benchmarks.” arXiv. https://doi.org/10.48550/arXiv.2302.02278.
• Mesman, Koen, Zaid Al-Ars, and Matthias Möller. 2022. “QPack: Quantum Approximate Optimization Algorithms as Universal Benchmark for Quantum Computers.” arXiv. https://doi.org/10.48550/arXiv.2103.17193.
• Mundada, Pranav S., Aaron Barbosa, Smarak Maity, Yulun Wang, T. M. Stace, Thomas Merkh, Felicity Nielson, et al. 2023. “Experimental Benchmarking of an Automated Deterministic Error Suppression Workflow for Quantum Algorithms.” arXiv. https://doi.org/10.48550/arXiv.2209.06864.
• Tomesh, Teague, Pranav Gokhale, Victory Omole, Gokul Subramanian Ravi, Kaitlin N. Smith, Joshua Viszlai, Xin-Chuan Wu, Nikos Hardavellas, Margaret R. Martonosi, and Frederic T. Chong. 2022. “SupermarQ: A Scalable Quantum Benchmark Suite.” arXiv. http://arxiv.org/abs/2202.11045.
• Zhang, Victoria, and Paul D. Nation. 2023. “Characterizing Quantum Processors Using Discrete Time Crystals.” arXiv. https://doi.org/10.48550/arXiv.2301.07625.
We featured some of Kordzanganeh’s results in QCR last time. In the next article, I’ll describe my favorite benchmarks: Lubinski et al.’s Application-Oriented Benchmarks.
(*) The Quantum Tech Stack concept is threaded throughout GQI’s method of analyzing quantum technology developments. This slide is from GQI’s Quantum Hardware State of Play, a 47-slide deck that steps through the stack layers with the accompanying latest developments and how to evaluate them. If you are interested in seeing this State of Play, or any of the other State of Plays (Quantum Technology, Quantum Safe, Quantum Sensing, Imaging, and Timing, Quantum Software, Quantum Landscape), please don’t hesitate to contact [email protected].
October 18, 2024