AGGLIO: Global Optimization for Locally Convex Functions
The paper addresses non-convex optimization challenges in machine learning, though the contribution appears incremental because it builds on existing graduated optimization techniques.
The paper tackles non-convex optimization of objectives that are only locally convex, such as those arising from activation functions like sigmoid and SiLU. It proposes AGGLIO, which offers global convergence guarantees and, in the authors' experiments, outperforms recent methods in convergence rate and accuracy.
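To make "locally convex" concrete, the following minimal sketch evaluates the second derivative of a one-dimensional squared loss with a sigmoid activation. The single training point (x = 1, y = 0.5) and the function name `loss_curvature` are assumptions chosen for illustration and do not come from the paper.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss_curvature(w: float, x: float = 1.0, y: float = 0.5) -> float:
    """Second derivative in w of the squared loss (sigmoid(w * x) - y) ** 2."""
    s = sigmoid(w * x)
    ds = s * (1.0 - s)            # first derivative of sigmoid at z = w * x
    d2s = ds * (1.0 - 2.0 * s)    # second derivative of sigmoid at z = w * x
    return 2.0 * x * x * (ds * ds + (s - y) * d2s)


# Hypothetical toy setting: y = sigmoid(0), so the minimiser is w = 0.
near = loss_curvature(0.0)   # curvature at the minimiser
far = loss_curvature(4.0)    # curvature far from the minimiser
print(f"curvature near optimum: {near:+.4f}")
print(f"curvature far away:     {far:+.4f}")
```

In this toy setting the curvature is positive at the minimiser and negative at w = 4, so the loss is convex only in a neighbourhood of the optimum.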
This paper presents AGGLIO (Accelerated Graduated Generalized LInear-model Optimization), a stage-wise, graduated optimization technique that offers global convergence guarantees for non-convex optimization problems whose objectives offer only local convexity and may fail to be even quasi-convex at a global scale. In particular, this includes learning problems that utilize popular activation functions such as sigmoid, softplus and SiLU that yield non-convex training objectives. AGGLIO can be readily implemented using point as well as mini-batch SGD updates and offers provable convergence to the global optimum in general conditions. In experiments, AGGLIO outperformed several recently proposed optimization techniques for non-convex and locally convex objectives in terms of convergence rate as well as convergent accuracy. AGGLIO relies on a graduation technique for generalized linear models, as well as a novel proof strategy, both of which may be of independent interest.
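The abstract notes that the method can be implemented with point or mini-batch SGD updates. The sketch below shows only that basic building block: plain mini-batch SGD on a noise-free sigmoid generalized linear model under squared loss. It is not AGGLIO itself; the graduation schedule is not described in this summary and is deliberately left out. The synthetic data, the helper names (`minibatch_sgd`, `mean_sq_loss`), the step size and the batch size are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_sq_loss(w, data):
    """Mean of (sigmoid(<w, x>) - y) ** 2 over the data set."""
    return sum((sigmoid(dot(w, x)) - y) ** 2 for x, y in data) / len(data)


def minibatch_sgd(data, dim, lr=1.0, batch_size=8, steps=3000, seed=0):
    """Plain mini-batch SGD; batch_size=1 gives point-wise SGD updates."""
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        grad = [0.0] * dim
        for x, y in batch:
            s = sigmoid(dot(w, x))
            # d/dz of (sigmoid(z) - y) ** 2, evaluated at z = <w, x>
            coeff = 2.0 * (s - y) * s * (1.0 - s)
            for j in range(dim):
                grad[j] += coeff * x[j] / batch_size
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical synthetic problem: noise-free labels from a known model.
data_rng = random.Random(1)
w_true = [1.0, -0.5]
data = []
for _ in range(200):
    x = [data_rng.uniform(-1.0, 1.0) for _ in w_true]
    data.append((x, sigmoid(dot(w_true, x))))

w_hat = minibatch_sgd(data, dim=len(w_true))
print("estimated weights:", [round(v, 3) for v in w_hat])
print("final loss:", mean_sq_loss(w_hat, data))
```

This toy problem is well conditioned and starts close to the optimum, so plain SGD is enough here. The harder case the paper targets is one where the objective is only locally convex and the initial point may lie outside the convex region.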