LGOCJan 18, 2021

Screening for Sparse Online Learning

arXiv:2101.06982v24 citations
AI Analysis

This work addresses computational inefficiencies in sparse online learning for machine learning practitioners, though it is incremental as it builds on existing algorithms.

The paper tackles the problem of online learning algorithms lacking finite activity identification for sparsity-promoting regularizers by introducing a screening rule to eliminate useless features, resulting in significant computational acceleration.

Sparsity promoting regularizers are widely used to impose low-complexity structure (e.g. l1-norm for sparsity) to the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequence generated by iterative algorithms (such as proximal gradient descent) exhibit "finite activity identification", namely, they can identify the low-complexity structure in a finite number of iterations. However, most online algorithms (such as proximal stochastic gradient descent) do not have the property owing to the vanishing step-size and non-vanishing variance. In this paper, by combining with a screening rule, we show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite activity identification. One consequence is that when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited for computational gains. Numerically, significant acceleration can be obtained.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes