QUANT-PH AI · Sep 26, 2025

Optimizing the non-Clifford-count in unitary synthesis using Reinforcement Learning

David Kremer, Ali Javadi-Abhari, Priyanka Mukhopadhyay

arXiv:2509.21709v1 · 6 citations · h-index: 26

Originality: Highly original

AI Analysis

This work addresses the challenge of implementing quantum algorithms efficiently, offering quantum computing practitioners incremental improvements in scalability and success rate over existing synthesis methods.

The paper tackles the problem of synthesizing quantum circuits for unitary operators, using reinforcement learning to optimize the non-Clifford count (T-count and CS-count). It achieves close-to-optimal decompositions for up to 100 T gates on two qubits, which is 5 times more than previous RL algorithms and, to the best of the authors' knowledge, the largest instances achieved with any method to date.

An efficient implementation of unitary operators is important for practically realizing the computational advantages that quantum algorithms claim over their classical counterparts. In this paper we study the potential of reinforcement learning (RL) for synthesizing quantum circuits while optimizing the T-count and CS-count of unitaries that are exactly implementable by the Clifford+T and Clifford+CS gate sets, respectively. In general, the complexity of existing algorithms depends exponentially on the number of qubits and the non-Clifford count of the unitaries. We have designed our RL framework to work with the channel representation of unitaries, which enables us to perform matrix operations efficiently, using integers only. We have also incorporated pruning heuristics and a canonicalization of operators in order to reduce the search complexity. As a result, compared to previous works, we are able to implement significantly larger unitaries in less time, with a much better success rate and improvement factor. Our results for Clifford+T synthesis on two qubits achieve close-to-optimal decompositions for up to 100 T gates, 5 times more than previous RL algorithms and, to the best of our knowledge, the largest instances achieved with any method to date. Our RL algorithm is able to recover the previously known optimal linear-complexity algorithm for T-count-optimal decomposition of 1-qubit unitaries. For 2-qubit Clifford+CS unitaries, our algorithm achieves linear complexity, something that could previously be accomplished only by an algorithm using the $SO(6)$ representation.
