Automatic Differentiation of Optimization Algorithms with Time-Varying Updates
This work addresses a theoretical gap for researchers in optimization and machine learning by extending differentiation techniques to time-varying algorithms, but it is incremental as it builds on existing methods for specific cases.
The paper tackles the problem of applying automatic differentiation to optimization algorithms with time-varying updates, such as variable step sizes, by providing convergence guarantees for derivative iterates and confirming these with numerical experiments on regularized regression problems, showing that the algorithm's convergence rate is reflected in its derivatives.
Numerous Optimization Algorithms have a time-varying update rule thanks to, for instance, a changing step size, momentum parameter or, Hessian approximation. In this paper, we apply unrolled or automatic differentiation to a time-varying iterative process and provide convergence (rate) guarantees for the resulting derivative iterates. We adapt these convergence results and apply them to proximal gradient descent with variable step size and FISTA when solving partly smooth problems. We confirm our findings numerically by solving $\ell_1$ and $\ell_2$-regularized linear and logisitc regression respectively. Our theoretical and numerical results show that the convergence rate of the algorithm is reflected in its derivative iterates.