Introduction

Quantum optimization on noisy intermediate-scale quantum (NISQ) hardware offers a practical pathway to explore potential speedups for combinatorial and real-world optimization problems. Unlike fault-tolerant approaches that await future hardware, near-term methods leverage current devices by combining quantum subroutines with classical processing. This article reviews recent experimental and theoretical advances implemented on actual quantum processors, focusing on canonical problems such as spin-glass Ising models, higher-order binary optimization, Max-Cut, and maximum independent set.
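To make the problem setting concrete, here is a minimal sketch of the standard mapping from Max-Cut to a QUBO, the kind of formulation most of these papers start from. It assumes Python with networkx and numpy available; the function names are illustrative and not taken from any of the reviewed works.

```python
# Minimal sketch: Max-Cut as a QUBO (minimize x^T Q x over x in {0,1}^n).
# Each edge (i, j) lowers the objective by 1 exactly when i and j land on
# opposite sides of the cut, so minimizing the QUBO maximizes the cut.
import networkx as nx
import numpy as np

def maxcut_qubo(graph: nx.Graph) -> np.ndarray:
    n = graph.number_of_nodes()
    Q = np.zeros((n, n))
    for i, j in graph.edges():
        # x_i + x_j - 2*x_i*x_j equals 1 iff the edge is cut; negate it.
        Q[i, i] -= 1
        Q[j, j] -= 1
        Q[i, j] += 2
    return Q

def cut_value(graph: nx.Graph, assignment: np.ndarray) -> int:
    """Number of edges crossing the cut defined by a 0/1 assignment."""
    return sum(1 for i, j in graph.edges() if assignment[i] != assignment[j])

if __name__ == "__main__":
    g = nx.random_regular_graph(3, 10, seed=0)         # small 3-regular instance
    x = np.random.randint(0, 2, g.number_of_nodes())
    print(cut_value(g, x), -(x @ maxcut_qubo(g) @ x))  # the two values agree
```

Spin-glass Ising instances are handled analogously with ±1 spin variables, and higher-order binary optimization (HUBO) adds terms involving products of three or more variables.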

Many vendors have reported signs of quantum advantage or supremacy in various contexts. Here, we examine six representative works that provide concrete benchmarks on today’s hardware (in no particular order):

Q-CTRL: Quantum optimization using a 127-qubit gate-model IBM quantum computer can outperform quantum annealers for nontrivial binary optimization problems

KipuQuantum: Runtime Quantum Advantage with Digital Quantum Optimization

USRA and Rigetti: A Multilevel Approach For Solving Large-Scale QUBO Problems With Noisy Hybrid Quantum Approximate Optimization

QuEra: Solving non-native combinatorial optimization problems using hybrid quantum-classical algorithms

D-Wave: Computational supremacy in quantum simulation

IonQ: Performant near-term quantum combinatorial optimization

Key Evaluation Metrics

The papers reviewed in this article were selected based on two guiding objectives: (1) to showcase a variety of hardware architectures, and (2) to highlight promising results that meet objective, rigorous evaluation criteria. These criteria are outlined below.

  • Classical baselines:
    Benchmarking against state-of-the-art classical solvers ensures that any observed quantum advantage reflects a genuine hardware-based improvement rather than an overly simple problem instance. The appropriate classical heuristics and exact solvers differ by problem type—for example, matrix product state (MPS) methods are the state of the art for simulating spin-glass models, whereas the Goemans–Williamson algorithm is the standard approximation baseline for Max-Cut. Selecting the strongest classical comparator is essential for meaningful validation.
  • Quantum vs classical analog:
    Many quantum algorithms can substitute their quantum sampling step with a classical probabilistic analog, such as simulated annealing, while leaving the rest of the algorithm unchanged. This comparison isolates the unique contribution of the quantum hardware and tests whether phenomena like superposition, entanglement, or tunneling lead to measurable performance gains. In some cases, this analysis also inspires new classical heuristics that are competitive in their own right, thereby contributing to classical optimization research. (A generic simulated-annealing sampler of this kind is sketched after this list.)
  • Wall‑clock time reporting:
    Reporting only raw QPU execution time neglects the complete workflow overhead, including compilation, queuing, and classical pre/post-processing. Providing total wall‑clock time offers a realistic view of throughput and is vital for practical deployment. This principle has been emphasized by Catherine McGeoch in How NOT to Fool the Masses When Giving Performance Results for Quantum Computers.
  • Solution quality vs computational effort:
    Solution quality is commonly measured in two ways. Likelihood (or success probability) is the probability that an algorithm returns the exact optimal solution on a given run, which is especially relevant for stochastic or sampling-based methods such as quantum circuits or annealers. Approximation ratio compares the value of the best solution found to the true optimal value, usually expressed as a percentage, and indicates how close the best answer is to optimal regardless of how often it is found. In short, likelihood captures how reliably the optimum is found, while approximation ratio captures how good the best-found solution is, even if it occurs only once. (A minimal sketch of computing both metrics from raw samples appears after this list.)
    Quantum heuristics also show different convergence behaviors—some approach high-quality solutions quickly and then plateau, while others improve gradually. Plotting approximation ratio against computational effort (e.g., iterations, wall-clock time, or circuit depth) illuminates the trade-off between speed and accuracy. As McGeoch notes, since the optimal solution is theoretically reachable with infinite computational resources, the key question becomes: “How much solution quality does the algorithm deliver per unit of computation time?”
  • QPU resource usage:
    The scalability of quantum algorithms is constrained by available hardware resources—particularly qubit count, two-qubit gate depth, and total gate operations. Quantifying resource usage as a function of problem size helps clarify both performance bottlenecks and opportunities for optimization, providing critical feedback to algorithm designers and hardware engineers alike.
  • Problem size solved:
    Real-world optimization problems are often large and complex. Demonstrating the ability to solve large instances—especially those that exceed the native capacity of current quantum devices—requires creative decomposition and mapping strategies. Tracking the maximum problem size solved helps gauge algorithmic maturity and indicates how best to utilize the capabilities of today’s hardware.
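As referenced above, a classical analog can often be built by swapping the quantum sampler for a classical one. The sketch below is a generic illustration, not the specific analog used in any of the reviewed papers: a single-spin-flip simulated-annealing sampler over an Ising cost function that could stand in for a quantum device inside a hybrid loop. The names (J, h, sweeps) are illustrative assumptions.

```python
# Generic "classical analog" sampler: draw spin configurations from simulated
# annealing instead of a quantum device, leaving the rest of a hybrid workflow
# unchanged. J is a symmetric coupling matrix with zero diagonal, h a field
# vector; the cost is E(s) = 1/2 s^T J s + h^T s. All names are illustrative.
import numpy as np

def simulated_annealing_sample(J, h, sweeps=200, beta_max=5.0, rng=None):
    rng = rng or np.random.default_rng()
    n = len(h)
    spins = rng.choice([-1, 1], size=n)
    for beta in np.linspace(0.1, beta_max, sweeps):    # slowly lower temperature
        for i in rng.permutation(n):
            # Energy change from flipping spin i (J_ii = 0, J symmetric).
            delta = -2 * spins[i] * (J[i] @ spins + h[i])
            if delta <= 0 or rng.random() < np.exp(-beta * delta):
                spins[i] = -spins[i]
    return spins
```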
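Similarly, the two solution-quality metrics described above are straightforward to compute from a batch of measured bitstrings once the objective and (for small instances) the true optimum are known. The sketch below is a minimal, hypothetical illustration; `cost` and `optimal_value` are assumed inputs, not quantities defined in the reviewed papers.

```python
# Hypothetical sketch: solution-quality metrics from a batch of samples.
# `cost` is the objective being maximized; `optimal_value` is assumed known
# (e.g., from an exact solver on a small instance).
from typing import Callable, Sequence

def success_probability(samples: Sequence, cost: Callable, optimal_value: float) -> float:
    """Likelihood metric: fraction of samples that hit the exact optimum."""
    return sum(1 for s in samples if cost(s) == optimal_value) / len(samples)

def approximation_ratio(samples: Sequence, cost: Callable, optimal_value: float) -> float:
    """Best sampled objective value divided by the optimum
    (for maximization problems with a positive optimum)."""
    return max(cost(s) for s in samples) / optimal_value
```

Plotting the approximation ratio of the best sample found so far against cumulative wall-clock time or iteration count then yields the quality-versus-effort curves discussed above.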

The comparison below summarizes the results from the six papers discussed above, presented in no particular order. Citation numbers refer to the numbered references. Here, n denotes the number of nodes in the problem graph and p the circuit depth.

Comparison Table

Each entry lists the problem size solved, the quantum resources required, runtime/iterations, solution quality, praise, and criticisms.
1. Quantum optimization using a 127-qubit gate-model IBM quantum computer can outperform quantum annealers for nontrivial binary optimization problems (Q-CTRL)
   Problem size solved:
   – Max-Cut: up to 120 qubits (3-regular graphs)
   – Spin-glass: 127 qubits
   – Higher-order binary optimization: 127 variables
   Quantum resources required:
   – IBM Quantum system, up to 127 qubits
   – Modified QAOA ansatz
   – Error suppression and QPU-overhead-free post-processing
   – Gate count scales as O(p·n) two-qubit gates (p = QAOA depth, n = number of qubits); for dense graphs, additional SWAP gates may be required for connectivity
   Runtime / iterations:
   – Full hardware execution, no classical simulation
   – Small problems (< 32 nodes): wall-clock time ~6 min, QPU time ~150 s
   – Large problems (> 32 nodes): wall-clock time ~20 min, QPU time 7–8 min
   Solution quality:
   – 99.5%+ approximation ratio for spin-glass
   – Up to 1,500× improvement over D-Wave annealers
   Praise:
   – High approximation ratio
   – Hardware agnostic, although results only shown for IBM’s QPUs
   Criticisms:
   – Compared to CPLEX and a greedy heuristic local solver, but not state-of-the-art classical heuristics
   – No thorough runtime reporting, and runtimes/QPU times are high given the problem sizes
   – Would be more compelling if the hardware-agnostic algorithm were also run on a platform other than IBM
   – For a fair comparison, post-processing should also be applied to the other benchmark algorithms
   – Classical solver used is not state-of-the-art (local heuristic)

2. Runtime Quantum Advantage with Digital Quantum Optimization (KipuQuantum)
   Problem size solved:
   – Higher-order unconstrained binary optimization (HUBO)
   – Problem sizes up to 156 qubits (IBM device)
   Quantum resources required:
   – IBM 156-qubit device (IBM Fez)
   – BF-DCQO algorithm (non-variational)
   – Classical pre/post-processing
   – Single- and two-qubit gate counts per iteration scale sub-quadratically with problem size and decrease as the number of iterations increases
   Runtime / iterations:
   – Quantum hardware: seconds
   – Classical (simulated annealing / CPLEX): fractions of a minute
   – Quantum achieves ≥10× speedup
   Solution quality:
   – Quantum meets or exceeds classical solution quality
   – Runtime quantum advantage shown for selected problems
   Praise:
   – High approximation ratio
   – Wall-clock times are reported
   – QPU gate counts scale well with problem size
   Criticisms:
   – Benchmarks against commercial solvers (not state-of-the-art)
   – Limited details on problem instance selection
   – Did not report results for the same algorithm with samples obtained classically (i.e., the classical analog of BF-DCQO)

3. A Multilevel Approach For Solving Large-Scale QUBO Problems With Noisy Hybrid Quantum Approximate Optimization (USRA and Rigetti)
   Problem size solved:
   – 82-qubit fully connected Sherrington–Kirkpatrick graphs
   – Graphs of up to ~27,000 nodes and ~1 million edges via multilevel decomposition
   Quantum resources required:
   – Rigetti Ankaa-2 QPU (82 qubits per subproblem)
   – Time-blocking (TB-QAOA) with depth p = 1 and 16 blocks, using about 65 RX gates and 48 ISWAP gates
   – Up to 600k QPU samples per subproblem
   Runtime / iterations:
   – (~10 seconds of QPU time per subproblem) × (20 to 60 subproblems per problem) = up to ~600 seconds of QPU time per problem
   Solution quality:
   – Approximation ratio > 95% after ~3 iterations (an “iteration” is an NDAR iteration, which involves many subroutines including multiple runs of QAOA)
   – Solution quality competitive with the classical analog (i.e., the same algorithm with samples obtained classically)
   Praise:
   – High approximation ratio on extremely large, difficult graphs
   – Results shown for the classical analog of the proposed approach (Figure 2)
   Criticisms:
   – Heavy reliance on classical pre- and post-processing
   – No wall-clock runtime reported
   – Subproblem size limited by available qubits
   – Results required a large amount of QPU time and many samples

4. Solving non-native combinatorial optimization problems using hybrid quantum-classical algorithms (QuEra)
   Problem size solved:
   – Demonstrated on Max k-Cut and Maximum Independent Set
   – Maximum problem size of 16 nodes (Max k-Cut)
   Quantum resources required:
   – Neutral-atom quantum computer
   – Hybrid quantum-classical non-native variational ansatz
   – Gate count not reported
   Runtime / iterations:
   – Hybrid workflow; runtime dominated by classical refinement
   – No explicit runtime or iteration count
   Solution quality:
   – Demonstrates a “comparative advantage” over the classical-only version
   – Approximation ratio matches classical at low depths and exceeds it at ~p = 5
   Praise:
   – Solves problems that do not map well onto quantum computers (MIS and Max k-Cut)
   Criticisms:
   – Small-scale demonstrations
   – Lacks direct, comprehensive benchmarking against state-of-the-art classical or quantum methods

5. Computational supremacy in quantum simulation (D-Wave)
   Problem size solved:
   – Quantum simulation of 2D/3D spin glasses and infinite-dimensional models
   – Square, cubic, diamond, and biclique lattices, up to linear size L = 8 (~500 logical qubits, ~5,000 physical qubits)
   Quantum resources required:
   – Superconducting quantum annealer
   – Number of physical qubits per logical qubit is high
   Runtime / iterations:
   – Exponential speedup claimed for quantum simulation
   – Not focused on optimization runtime
   Solution quality:
   – Agreement with Schrödinger-equation solutions
   – Stretched-exponential scaling for classical simulation
   Praise:
   – Benchmarked against MPS, which is state of the art for quantum simulation
   Criticisms:
   – No simulations of real materials; only random spin glasses

6. Performant near-term quantum combinatorial optimization (IonQ)
   Problem size solved:
   – Max-Cut: up to 20 qubits
   – Larger sizes in noiseless simulation
   Quantum resources required:
   – Trapped-ion quantum computer (32 qubits)
   – Linear-depth variational ansatz
   – Two-qubit gate count scales as O(n^2)
   Runtime / iterations:
   – Convergence in the “majority of cases,” but no explicit wall-clock time
   Solution quality:
   – Average approximation-ratio error below 10^-3 (i.e., within 0.1% of optimal) after at most 40 iterations
   Praise:
   – Resource-efficient
   – Both lower gate counts and fewer iterations than QAOA at the same approximation ratio
   Criticisms:
   – Only compares with vanilla QAOA + relax-and-round and a basic classical ADAM optimization algorithm (neither is state-of-the-art)
   – Maximum problem size is only 20 qubits
   – No wall-clock time

Note that the D-Wave study targets quantum simulation rather than optimization; since the same hardware and techniques can be applied to optimization problems, it is still included here. Because it is a simulation study, its metrics differ from those of the other papers: it reports correlation error and spin order rather than approximation ratio, which makes direct comparison with the other entries above more difficult.
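For readers unfamiliar with the simulation metrics, one plausible way to quantify correlation error is to compare measured two-point spin correlations against a trusted reference (for example, an exact Schrödinger-equation solution on a small instance). The sketch below encodes that assumed definition with illustrative names; it is not the exact protocol used in the D-Wave paper.

```python
# Illustrative only: correlation error as the mean absolute deviation between
# measured and reference two-point correlations <s_i s_j>. The definition and
# names here are assumptions, not taken from the D-Wave study.
import numpy as np

def correlation_error(spin_samples: np.ndarray, reference_corr: np.ndarray) -> float:
    """spin_samples: (num_samples, num_spins) array of +/-1 values.
    reference_corr: (num_spins, num_spins) reference correlation matrix."""
    measured_corr = spin_samples.T @ spin_samples / spin_samples.shape[0]
    return float(np.mean(np.abs(measured_corr - reference_corr)))
```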

Superlatives

Among these six studies, the largest problem instances are tackled in the USRA and Rigetti work, which handles graphs of up to roughly 27,000 nodes via a multilevel decomposition built on 82-qubit, fully connected Sherrington–Kirkpatrick subproblems. For runtime performance, KipuQuantum’s bias-field digitized counterdiabatic quantum optimization (BF-DCQO) demonstrates the fastest time-to-approximate-solution on IBM’s 156-qubit devices, showing a clear quantum speedup over the classical heuristics it was compared against. In terms of resource efficiency, IonQ’s trapped-ion implementation runs linear-depth circuits on a 32-qubit device (problem sizes up to 20 qubits), achieving high approximation ratios with fewer gates and iterations than traditional QAOA variants.

Conclusion

Near‑term quantum processors have begun to deliver tangible results on benchmark optimization tasks when paired with tailored algorithms and classical post‑processing. While none of these demonstrations yet displace top classical solvers across all metrics, they highlight clear trade‑offs: USRA et al. push problem scale via decomposition, KipuQuantum excels in runtimes, and IonQ prioritizes circuit efficiency. As hardware improves and algorithms mature, hybrid quantum‑classical workflows may unlock practical advantages for specialized problem classes.

About the Author
Jordan is an expert in quantum algorithms for optimization with a strong background in both academic research and industry applications. He spent several years as a research assistant at the University of Southern California, where he implemented quantum algorithms for optimization. In industry, he has tackled classical optimization problems across diverse sectors, including energy, healthcare, and cybersecurity. Jordan has contributed to several successful startups and holds multiple patents in control optimization for energy systems. He holds a Master’s degree in Systems Engineering from UC Berkeley and a Master’s in Applied Mathematics from the University of Washington, Seattle. He may be reached at makansij@gmail.com.

June 11, 2025