Accelerating Inference for Multilayer Neural Networks with Quantum Computers
This work is aimed at AI researchers and practitioners interested in accelerating neural network inference with quantum computing. Its contribution is incremental in that it builds on existing quantum and classical methods.
As a step towards integrating quantum computers into deep learning pipelines, the authors present the first fully-coherent quantum implementation of a multilayer neural network with non-linear activations. The result depends on the quantum data access regime: a quadratic speedup over classical methods with no assumptions (for shallow bilinear-style networks), a quartic speedup with efficient quantum access to the weights, and an inference cost of O(polylog(N/ε)^k) with efficient quantum access to both the inputs and the weights.
Fault-tolerant Quantum Processing Units (QPUs) promise to deliver exponential speedups in select computational tasks, yet their integration into modern deep learning pipelines remains unclear. In this work, we take a step towards bridging this gap by presenting the first fully-coherent quantum implementation of a multilayer neural network with non-linear activation functions. Our constructions mirror widely used deep learning architectures based on ResNet, and consist of residual blocks with multi-filter 2D convolutions, sigmoid activations, skip-connections, and layer normalizations. We analyse the complexity of inference for networks under three quantum data access regimes. Without any assumptions, we establish a quadratic speedup over classical methods for shallow bilinear-style networks. With efficient quantum access to the weights, we obtain a quartic speedup over classical methods. With efficient quantum access to both the inputs and the network weights, we prove that a network with an $N$-dimensional vectorized input, $k$ residual block layers, and a final residual-linear-pooling layer can be implemented with an error of $\epsilon$ with $O(\text{polylog}(N/\epsilon)^k)$ inference cost.
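For orientation, the classical computation performed by one residual block of the kind described above can be sketched as follows. This is a minimal NumPy sketch of a classical reference, not the quantum construction itself. The ordering of operations (convolution, then sigmoid, then skip-connection, then layer normalization), the zero "same" padding, and all function names are assumptions made for illustration; the abstract only lists the components.

```python
import numpy as np


def conv2d_same(x, kernels):
    """Multi-filter 2D convolution (cross-correlation, as in deep learning).

    x:       array of shape (C_in, H, W)
    kernels: array of shape (C_out, C_in, kh, kw) with odd kh and kw
    Zero padding keeps the spatial size (H, W) unchanged (an assumption).
    """
    c_out, c_in, kh, kw = kernels.shape
    ph, pw = kh // 2, kw // 2
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    out = np.zeros((c_out, h, w))
    for i in range(kh):
        for j in range(kw):
            # Mix input channels with the (C_out, C_in) slice of the kernel
            # at offset (i, j), applied to the correspondingly shifted input.
            out += np.einsum("oc,chw->ohw",
                             kernels[:, :, i, j],
                             xp[:, i:i + h, j:j + w])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def layer_norm(z, eps=1e-5):
    # Normalize over all entries of the block output (no learned scale/shift).
    return (z - z.mean()) / np.sqrt(z.var() + eps)


def residual_block(x, kernels):
    """Hypothetical block: layer_norm(x + sigmoid(conv(x))).

    The skip-connection requires C_out == C_in.
    """
    return layer_norm(x + sigmoid(conv2d_same(x, kernels)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 8, 8))        # 3 channels, 8x8 input
    k = rng.normal(size=(3, 3, 3, 3))     # 3 filters of size 3x3 over 3 channels
    y = residual_block(x, k)
    print(y.shape)
```

Stacking $k$ such blocks gives the classical counterpart of the $k$ residual block layers in the final complexity statement. As a sanity check on the sketch, setting all kernels to zero makes the convolution vanish, so the block reduces to `layer_norm(x)`, because the constant shift `sigmoid(0) = 0.5` is removed by the normalization.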