By Yuval Boger

Performance benchmarks provide users with a frame of reference to compare products in the same category. Many popular classical benchmarking tools exist, such as MLPerf for machine learning and PassMark and 3DMark for CPU and GPU performance. It stands to reason that quantum computing users can benefit from similar tools.

Benchmarks are critical as users struggle to translate hardware characteristics such as gate fidelity, coherence times, and qubit connectivity into meaningful business insight. After all, as enjoyable as the underlying technology might be, business users want to know how soon they can get valuable results from a given computer or, more generally, which computer (if any) is best to solve a particular problem. Benchmarks are also helpful to validate claims from vendors, such as claims about gate fidelity or the efficacy of error correction, and serve as internal development tools for such vendors.

Indeed, several commercial, academic, and standards organizations have launched benchmarking efforts, such as those from IBM, QED-C, Super.tech, the Unitary Fund, Sandia National Labs, and Atos.

These benchmarking suites typically fall into two categories: 1) system performance tests that measure hardware-related parameters such as speed and noise, and 2) application performance tests that compare simulated results of reference algorithms to actual execution results on various hardware platforms.
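To make the second category concrete, the comparison can be as simple as running a reference circuit on an ideal simulator and on the target device, then measuring how closely the two output distributions agree. The Python below is a minimal sketch of that idea, assuming Qiskit and its Aer simulator; the GHZ circuit and the Hellinger-fidelity score are illustrative stand-ins for whatever reference application and metric a real suite would define.

```python
# Sketch of an application-level benchmark: compare the ideal (noise-free)
# output distribution of a reference circuit against the distribution
# obtained from a noisy backend. Assumes Qiskit and Qiskit Aer.
from qiskit import QuantumCircuit, transpile
from qiskit.quantum_info import hellinger_fidelity
from qiskit_aer import AerSimulator


def ghz_circuit(n: int) -> QuantumCircuit:
    """Reference application: an n-qubit GHZ state with measurement."""
    qc = QuantumCircuit(n)
    qc.h(0)
    for i in range(n - 1):
        qc.cx(i, i + 1)
    qc.measure_all()
    return qc


def application_score(circuit: QuantumCircuit, backend, shots: int = 4096) -> float:
    """Hellinger fidelity between ideal and backend output distributions."""
    ideal_counts = AerSimulator().run(circuit, shots=shots).result().get_counts()
    compiled = transpile(circuit, backend)
    noisy_counts = backend.run(compiled, shots=shots).result().get_counts()
    return hellinger_fidelity(ideal_counts, noisy_counts)


if __name__ == "__main__":
    # A noisy simulator stands in for real hardware in this sketch.
    from qiskit_aer.noise import NoiseModel, depolarizing_error

    noise = NoiseModel()
    noise.add_all_qubit_quantum_error(depolarizing_error(0.01, 2), ["cx"])
    noisy_backend = AerSimulator(noise_model=noise)
    print(application_score(ghz_circuit(5), noisy_backend))
```

A score near 1 means the device reproduces the ideal distribution almost exactly; lower scores reflect the combined effect of noise, connectivity, and compilation on this particular application.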

In my opinion, one type of benchmark that’s missing is a way to determine the best hardware on which to execute a bespoke algorithm or program that an organization has developed. Some might call this “predictive benchmarking,” which might also consider the known or measured imperfections of a particular platform to predict and recommend the best one for a given application. Such predictive benchmarking is interesting for two reasons: 1) there can be dramatic variance in execution quality between different quantum computers, and 2) because organizations have access to multiple types of machines through quantum cloud providers, switching platforms is not difficult when the results warrant it.
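A rough sketch of what such predictive benchmarking could look like in practice: score the organization’s own circuit against every backend it can reach and rank the results. The Python below, again assuming Qiskit and Aer, uses noisy simulators with hypothetical error rates as stand-ins for cloud hardware; a real tool would draw on device calibration data or actual executions.

```python
# Sketch of "predictive benchmarking": run the organization's own circuit on
# each candidate backend, score every result against the ideal distribution,
# and rank the candidates. Assumes Qiskit and Qiskit Aer.
from qiskit import QuantumCircuit, transpile
from qiskit.quantum_info import hellinger_fidelity
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error


def score_on_backend(circuit: QuantumCircuit, backend, shots: int = 4096) -> float:
    """Hellinger fidelity of the backend's output vs. an ideal simulation."""
    ideal = AerSimulator().run(circuit, shots=shots).result().get_counts()
    noisy = backend.run(transpile(circuit, backend), shots=shots).result().get_counts()
    return hellinger_fidelity(ideal, noisy)


def rank_backends(circuit: QuantumCircuit, backends: dict) -> list:
    """Return (name, score) pairs, best candidate first."""
    scores = {name: score_on_backend(circuit, b) for name, b in backends.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # A toy 4-qubit circuit stands in for the organization's bespoke workload.
    qc = QuantumCircuit(4)
    qc.h(range(4))
    for i in range(3):
        qc.cx(i, i + 1)
    qc.measure_all()

    # Hypothetical candidate platforms, modeled here by different error rates.
    def noisy(p: float) -> AerSimulator:
        nm = NoiseModel()
        nm.add_all_qubit_quantum_error(depolarizing_error(p, 2), ["cx"])
        return AerSimulator(noise_model=nm)

    candidates = {"platform_a": noisy(0.005), "platform_b": noisy(0.02)}
    print(rank_backends(qc, candidates))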

Recently, I had the opportunity to discuss benchmarking with Pranav Gokhale, VP of Quantum Software at ColdQuanta (and formerly CEO of Super.tech, acquired by ColdQuanta). Gokhale and his coworkers started working on benchmarking in the middle of 2021 and published their suite of open-source benchmarks, called SupermarQ, as well as comparative measurements, earlier this year. SupermarQ includes application-centric tests in domains such as logistics, finance, chemistry, and encryption, as well as a test measuring error correction performance. Pranav mentioned that a key design goal of the suite was that it scale to a large number of qubits while maintaining the seemingly conflicting goal of classical verifiability.

I asked Pranav about market feedback on their product. He mentioned significant commercial and academic interest in benchmarking various algorithms and devices and interest from hardware vendors that leverage SupermarQ to track progress in their hardware development. Interestingly, Pranav reports that SupermarQ results often diverge significantly from predicted results that rely solely on qubit coherence and gate fidelity numbers. He says this happens because imperfections are often correlated (such as qubit crosstalk). As such, Super.tech believes their benchmark suite helps de-hype the quantum market, demonstrating real-world performance metrics for quantum computers.

Many hardware vendors may have legitimate complaints about the accuracy of benchmarking suites. Vendors might claim they could rewrite and optimize these test applications for their platforms by using platform-specific features, native gates, or a better configuration of their transpiler. As several recent hackathons and coding competitions have shown, there are numerous ways to implement any given algorithm, sometimes differing by orders of magnitude in efficiency.

In classical machine learning, AlexNet, the winner of the ImageNet image-classification competition, revolutionized the field. If quantum computing organizations initiated similar efforts, providing sample data sets and seeking the best quantum solution, vendors could demonstrate the power of their quantum platforms with optimal algorithms and settings. Both end users and researchers might benefit from such efforts.

Benchmarking is important. Without it, we’d be comparing the proverbial apples to oranges. But quantum benchmarking still appears to be in its infancy.

Yuval Boger is a quantum computing executive. Known as the original “Qubit Guy,” he most recently served as Chief Marketing Officer for Classiq.

October 24, 2022