MLLGOCMay 21, 2018

Stochastic Gradient Descent for Stochastic Doubly-Nonconvex Composite Optimization

arXiv:1805.07960v22 citations
AI Analysis

This work addresses a theoretical gap in optimization for machine learning by providing convergence guarantees for a previously unsolved doubly-nonconvex case, which is incremental but important for nonconvex models.

The paper tackles the problem of stochastic gradient descent for doubly-nonconvex composite optimization, where both composite functions are nonconvex, by assuming a quasiconvex penalty condition to derive convergence properties and an optimal convergence rate that outperforms existing methods.

The stochastic gradient descent has been widely used for solving composite optimization problems in big data analyses. Many algorithms and convergence properties have been developed. The composite functions were convex primarily and gradually nonconvex composite functions have been adopted to obtain more desirable properties. The convergence properties have been investigated, but only when either of composite functions is nonconvex. There is no convergence property when both composite functions are nonconvex, which is named the \textit{doubly-nonconvex} case.To overcome this difficulty, we assume a simple and weak condition that the penalty function is \textit{quasiconvex} and then we obtain convergence properties for the stochastic doubly-nonconvex composite optimization problem.The convergence rate obtained here is of the same order as the existing work.We deeply analyze the convergence rate with the constant step size and mini-batch size and give the optimal convergence rate with appropriate sizes, which is superior to the existing work. Experimental results illustrate that our method is superior to existing methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes