A Randomized Nonmonotone Block Proximal Gradient Method for a Class of Structured Nonlinear Programming
This work is an incremental improvement in optimization for machine learning: in preliminary experiments on the ℓ₁-regularized least-squares problem and a dual SVM problem, the proposed method outperforms randomized block coordinate descent.
The authors address the problem of minimizing the sum of a smooth function and a block-separable nonsmooth function by proposing a randomized nonmonotone block proximal gradient method. They show that any accumulation point of its iterates is almost surely a stationary point, and that the method outperforms existing randomized block coordinate descent methods in computational experiments.
We propose a randomized nonmonotone block proximal gradient (RNBPG) method for minimizing the sum of a smooth (possibly nonconvex) function and a block-separable (possibly nonconvex nonsmooth) function. At each iteration, this method randomly picks a block according to any prescribed probability distribution and typically solves several associated proximal subproblems, which usually have closed-form solutions, until a certain amount of progress in the objective value is achieved. In contrast to the usual randomized block coordinate descent method [23,20], our method has a nonmonotone flavor and uses variable stepsizes that can partially utilize the local curvature information of the smooth component of the objective function. We show that any accumulation point of the solution sequence of the method is a stationary point of the problem {\it almost surely} and that the method is capable of finding an approximate stationary point with high probability. We also establish a sublinear rate of convergence for the method in terms of the minimal expected squared norm of certain proximal gradients over the iterations. When the problem under consideration is convex, we show that the expected objective values generated by RNBPG converge to the optimal value of the problem. Under some assumptions, we further establish sublinear and linear rates of convergence for the expected objective values generated by a monotone version of RNBPG. Finally, we conduct some preliminary experiments to test the performance of RNBPG on the $\ell_1$-regularized least-squares problem and a dual SVM problem in machine learning. The computational results demonstrate that our method substantially outperforms the randomized block coordinate {\it descent} method with fixed or variable stepsizes.
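The iteration described in the abstract can be illustrated with a minimal sketch for the $\ell_1$-regularized least-squares problem, $\min_x \frac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$, where the block proximal subproblem has the closed-form soft-thresholding solution. This is not the authors' implementation. The function name `rnbpg_lasso`, the uniform block sampling, the acceptance test against the maximum of the last `M` objective values, the restart of the trial stepsize parameter from `L0`, and all default parameter values are assumptions made for illustration only; the paper's exact stepsize rule and progress criterion may differ.

```python
import numpy as np


def soft_threshold(v, t):
    """Closed-form proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def rnbpg_lasso(A, b, lam, n_blocks=4, iters=500, M=5,
                sigma=1e-4, eta=2.0, L0=1.0, max_backtracks=50, seed=0):
    """Illustrative nonmonotone block proximal gradient loop (hypothetical defaults)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    blocks = np.array_split(np.arange(n), n_blocks)
    x = np.zeros(n)
    r = A @ x - b  # residual, kept up to date across iterations

    def objective(x, r):
        return 0.5 * (r @ r) + lam * np.abs(x).sum()

    hist = [objective(x, r)]
    for k in range(iters):
        idx = blocks[rng.integers(n_blocks)]  # uniform block sampling (assumption)
        g = A[:, idx].T @ r                   # block gradient of the smooth part
        ref = max(hist[-M:])                  # nonmonotone reference value
        L = L0                                # optimistic trial parameter (assumption)
        for _ in range(max_backtracks):
            # Proximal subproblem on the chosen block, solved in closed form.
            new_block = soft_threshold(x[idx] - g / L, lam / L)
            d = new_block - x[idx]
            r_try = r + A[:, idx] @ d
            x_try = x.copy()
            x_try[idx] = new_block
            f_try = objective(x_try, r_try)
            # Accept once enough progress is made relative to the reference value.
            if f_try <= ref - 0.5 * sigma * (d @ d):
                x, r = x_try, r_try
                break
            L *= eta  # otherwise shorten the step and solve the subproblem again
        hist.append(objective(x, r))  # unchanged if no trial step was accepted
    return x, np.array(hist)
```

In this sketch, setting `M=1` compares each trial point only against the current objective value, which gives a monotone variant; larger `M` allows the objective to increase temporarily, which is the nonmonotone flavor the abstract refers to.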