Linear convergence of forward-backward accelerated algorithms without knowledge of the modulus of strong convexity
This result closes a theoretical gap for practitioners who use accelerated methods in fields such as image science and engineering, though the contribution is incremental in that it builds on the existing high-resolution ODE framework.
The paper tackles the open problem of whether Nesterov's accelerated gradient descent (NAG) and FISTA converge linearly on strongly convex functions without prior knowledge of the modulus of strong convexity. It proves that they do, that the linear convergence is independent of the parameter $r$, and that the square of the proximal subgradient norm also converges linearly.
A significant milestone in modern gradient-based optimization was achieved with the development of Nesterov's accelerated gradient descent (NAG) method. This forward-backward technique has been further advanced with the introduction of its proximal generalization, commonly known as the fast iterative shrinkage-thresholding algorithm (FISTA), which is widely applied in image science and engineering. Nonetheless, it remains unclear whether NAG and FISTA converge linearly for strongly convex functions. Notably, these algorithms converge without requiring any prior knowledge of the modulus of strong convexity, and this characteristic has been acknowledged as an open problem in the comprehensive review [Chambolle and Pock, 2016, Appendix B]. In this paper, we address this question using the high-resolution ordinary differential equation (ODE) framework. Building on the established phase-space representation, we emphasize the distinctive construction of the Lyapunov function, in which the coefficient of the kinetic energy adapts dynamically over the iterations. Furthermore, we show that the linear convergence of both NAG and FISTA is independent of the parameter $r$. Finally, we demonstrate that the square of the proximal subgradient norm also converges linearly.
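To make concrete what "without prior knowledge of the modulus" means, the following is a minimal sketch of NAG in one common form, not the paper's own implementation. It assumes the momentum coefficient $k/(k+r+1)$ (the paper's indexing may differ) and a step size of $1/L$. The function name `nag`, the diagonal quadratic test problem, and all parameter defaults are hypothetical choices made for illustration. Note that the update uses only the gradient, the step size, and $r$; the modulus of strong convexity never appears.

```python
import numpy as np

def nag(grad, x0, step, r=2.0, iters=500):
    """Sketch of Nesterov's accelerated gradient descent.

    Assumed form (not taken from the paper):
        x_{k+1} = y_k - step * grad(y_k)
        y_{k+1} = x_{k+1} + k / (k + r + 1) * (x_{k+1} - x_k)
    The modulus of strong convexity is not an input.
    """
    x_prev = np.array(x0, dtype=float)
    y = x_prev.copy()
    for k in range(iters):
        x = y - step * grad(y)                      # gradient (forward) step
        y = x + (k / (k + r + 1.0)) * (x - x_prev)  # momentum extrapolation
        x_prev = x
    return x_prev

# Hypothetical test problem: f(x) = 0.5 * sum(d_i * x_i^2), minimized at 0.
# It is strongly convex with modulus min(d) = 1, which the solver never sees.
diag = np.array([1.0, 10.0, 100.0])

def f(x):
    return 0.5 * np.sum(diag * x ** 2)

def grad_f(x):
    return diag * x

x0 = np.ones(3)
L = diag.max()  # Lipschitz constant of the gradient
x_final = nag(grad_f, x0, step=1.0 / L)
```

Replacing the gradient step with a proximal gradient step would turn this sketch into a FISTA-style iteration; that variant is not shown here.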