LGOCMLMay 25, 2016

On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don't Worry About Its Nonsmooth Loss Function

arXiv:1605.07950v611 citations
Originality Incremental advance
AI Analysis

This work addresses the computational efficiency problem for researchers and practitioners using robust regression methods, though it is incremental in optimizing existing algorithms.

The paper tackles the computational challenge of SQRT-Lasso regression by showing that its nonsmooth loss function does not hinder the use of proximal algorithms, achieving fast convergence guarantees with high probability, as supported by numerical results.

Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility. However, by exploring the modeling structures, we find these "sacrifices" do not always require more computational efforts. To shed light on such a "free-lunch" phenomenon, we study the square-root-Lasso (SQRT-Lasso) type regression problem. Specifically, we show that the nonsmooth loss functions of SQRT-Lasso type regression ease tuning effort and gain adaptivity to inhomogeneous noise, but is not necessarily more challenging than Lasso in computation. We can directly apply proximal algorithms (e.g. proximal gradient descent, proximal Newton, and proximal Quasi-Newton algorithms) without worrying the nonsmoothness of the loss function. Theoretically, we prove that the proximal algorithms combined with the pathwise optimization scheme enjoy fast convergence guarantees with high probability. Numerical results are provided to support our theory.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes