LGAISTFeb 8, 2019

Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance

arXiv:1902.03046v356 citations
AI Analysis

This work addresses convergence speed for machine learning practitioners using regularized convex methods, offering theoretical improvements but is incremental as it extends existing least-squares analysis to a broader loss class.

The paper tackles the problem of slow convergence rates in regularized empirical risk minimization by assuming self-concordant losses, which include least-squares and generalized linear models, and shows that adapting source and capacity conditions leads to fast non-asymptotic rates, improving bias and variance terms beyond the generic O(1/√n) rate.

We consider learning methods based on the regularization of a convex empirical risk by a squared Hilbertian norm, a setting that includes linear predictors and non-linear predictors through positive-definite kernels. In order to go beyond the generic analysis leading to convergence rates of the excess risk as $O(1/\sqrt{n})$ from $n$ observations, we assume that the individual losses are self-concordant, that is, their third-order derivatives are bounded by their second-order derivatives. This setting includes least-squares, as well as all generalized linear models such as logistic and softmax regression. For this class of losses, we provide a bias-variance decomposition and show that the assumptions commonly made in least-squares regression, such as the source and capacity conditions, can be adapted to obtain fast non-asymptotic rates of convergence by improving the bias terms, the variance terms or both.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes