QUANT-PH · LG · Mar 6, 2023

Towards provably efficient quantum algorithms for large-scale machine-learning models

arXiv:2303.03428v5 · 100 citations · h-index: 34
AI Analysis

This work addresses the problem of high computational costs in training large AI models, potentially benefiting the machine learning community, though it is incremental as it builds on earlier quantum algorithms for differential equations.

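The paper tackles the computational bottlenecks of large-scale machine learning models by arguing that fault-tolerant quantum computing could run (stochastic) gradient descent with provable efficiency, scaling as O(T^2 polylog(n)), where n is the model size and T is the number of training iterations, provided the models are sufficiently dissipative and sparse and the learning rate is small. Benchmarks on models from 7 million to 103 million parameters suggest that, in sparse training, a quantum enhancement is possible at the early stage of learning after model pruning.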

Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include huge computational expenses, power, and time used both in the pre-training and fine-tuning process. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as O(T^2 polylog(n)), where n is the size of the models and T is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Based on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows solidly that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.
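The two conditions named in the abstract, sparsity and dissipation, can be illustrated with a purely classical toy. The sketch below is not the paper's quantum algorithm: the quadratic loss, the tridiagonal coupling, and the names `prune_mask`, `grad`, and `sparse_gd` are assumptions made for illustration. It applies magnitude pruning, then runs gradient descent (a forward-Euler step of d(theta)/dt = -grad L) on the surviving parameters, and records the parameter norm to show that it shrinks at every step.

```python
# Classical toy only: magnitude pruning followed by gradient descent on a
# sparse quadratic loss L(theta) = 0.5 * theta^T A theta, where A is
# tridiagonal (2 on the diagonal, -0.5 off the diagonal). All names and
# numbers here are hypothetical and chosen for illustration.

def prune_mask(theta, keep):
    """Return a 0/1 mask keeping the `keep` largest-magnitude entries."""
    order = sorted(range(len(theta)), key=lambda i: abs(theta[i]), reverse=True)
    kept = set(order[:keep])
    return [1.0 if i in kept else 0.0 for i in range(len(theta))]

def grad(theta):
    """Gradient A @ theta; each component touches at most 3 parameters."""
    n = len(theta)
    g = []
    for i in range(n):
        left = theta[i - 1] if i > 0 else 0.0
        right = theta[i + 1] if i < n - 1 else 0.0
        g.append(2.0 * theta[i] - 0.5 * left - 0.5 * right)
    return g

def sparse_gd(theta, mask, lr, steps):
    """Gradient descent restricted to the unpruned parameters.

    Each update theta <- theta - lr * grad is one forward-Euler step of the
    gradient-flow ODE. Returns the final parameters and the norm per step.
    """
    theta = [t * m for t, m in zip(theta, mask)]
    norms = []
    for _ in range(steps):
        g = grad(theta)
        theta = [m * (t - lr * gi) for t, gi, m in zip(theta, g, mask)]
        norms.append(sum(t * t for t in theta) ** 0.5)
    return theta, norms

theta0 = [0.9, -0.1, 0.05, 1.2, -0.7, 0.02, 0.3, -1.1]
mask = prune_mask(theta0, keep=4)
final, norms = sparse_gd(theta0, mask, lr=0.1, steps=50)
print("mask:", mask)
print("norm after first and last step:", norms[0], norms[-1])
```

Because A is symmetric with eigenvalues between 1 and 3, a learning rate of 0.1 makes every step a contraction, which is the toy analogue of the "sufficiently dissipative, small learning rate" regime; the mask stands in for the sparsity obtained by pruning.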
