On the Computational Power of Online Gradient Descent
This work addresses the fundamental computational limits of analyzing gradient descent for researchers in machine learning theory and optimization.
The paper demonstrates that online gradient descent can simulate arbitrary polynomial-space computations, even in simple learning settings, and shows that efficiently analyzing its fine-grained behavior is impossible under weak complexity-theoretic assumptions.
We prove that the evolution of weight vectors in online gradient descent can encode arbitrary polynomial-space computations, even in very simple learning settings. Our results imply that, under weak complexity-theoretic assumptions, it is impossible to reason efficiently about the fine-grained behavior of online gradient descent.