Tensor-Parallel Emulation of Quantum Circuits with Block-Cyclic Distributed Matrix Product States
This work addresses a gap in distributed-memory tensor network methods for quantum circuit emulation and offers a scalable solution that could enhance algorithms based on dense tensor networks, though its methodological improvements appear incremental.
The paper tackles the problem of emulating quantum circuits using tensor networks by introducing a tensor-parallel distribution scheme for matrix product states, achieving a bond dimension of 16,384 and surpassing state-of-the-art accuracy by three orders of magnitude on 32 nodes.
Tensor networks provide an adaptable framework for the emulation of quantum circuits. By partitioning exponentially large registers and gates into smaller tensors, they enable fast transformations through tensor algebra and give fine control over memory, runtime and accuracy. Because tensor networks have an inherently lower memory footprint, distributed-memory methods for them remain underdeveloped. The parallel techniques that do exist are usually limited to direct contraction and sampling problems, and a more general approach is needed for tensor representations such as matrix product states (MPS), which efficiently approximate full quantum state evolution. In this study, we expand the MPS site tensors beyond local memory by introducing a tensor-parallel distribution scheme, in which individual dense tensors are evenly scattered across a subset of indices. This is further facilitated by using pivoted QR factorisation instead of the slower singular value decomposition (SVD). We demonstrate the capabilities of our approach by approximately emulating Google's classically difficult random circuit sampling (RCS) benchmark. The highest bond dimension reached is 16,384, surpassing the accuracy of state-of-the-art methods by three orders of magnitude on 32 nodes of ARCHER2. We also show how this helps advance experiments involving more practical quantum phase estimation circuits. Our approach has the potential to enhance numerous algorithms based on dense tensor networks, offering a scalable and naturally load-balanced distribution formula. It is also compatible with other types of parallelism, unlocking new opportunities to push the quantum-classical computational phase boundary.
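To make the idea of evenly scattering a tensor index concrete, the following is a minimal sketch of a one-dimensional block-cyclic index map, the kind of layout the title refers to. It is an illustration under stated assumptions, not the paper's implementation: the function names, the block size, the process count and the choice of a single distributed index are all hypothetical, and the abstract does not specify which indices are distributed or how the process grid is arranged.

```python
from collections import Counter
from typing import Tuple


def block_cyclic_owner(global_index: int, block_size: int, num_procs: int) -> Tuple[int, int]:
    """Map a global index to (owning process, local index) in a 1D block-cyclic layout.

    Hypothetical helper for illustration only.
    """
    block = global_index // block_size      # block that contains this index
    owner = block % num_procs               # blocks are dealt out round-robin
    local_block = block // num_procs        # earlier blocks already held by this owner
    local_index = local_block * block_size + global_index % block_size
    return owner, local_index


# Assumed toy sizes: one bond index of dimension 16, blocks of 2, 4 processes.
bond_dim, block_size, num_procs = 16, 2, 4
owners = [block_cyclic_owner(g, block_size, num_procs)[0] for g in range(bond_dim)]
counts = Counter(owners)  # number of index values held by each process

print(owners)
print(dict(counts))
```

With these toy sizes every process holds the same number of index values. That is the sense in which such a layout is load-balanced whenever the dimension is a multiple of `block_size * num_procs`; otherwise the per-process counts differ by at most one block.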