LG · SP · OC · Feb 22, 2021

Expanding boundaries of Gap Safe screening

arXiv:2102.10846v2 · 16 citations
AI Analysis

This work provides an incremental improvement for researchers and practitioners in fields like statistics and machine learning by enhancing safe screening techniques to handle a wider class of optimization problems.

The paper accelerates sparse optimization algorithms by extending the Gap Safe screening framework in two ways: it replaces the global strong-concavity assumption on the dual with local regularity properties, and it integrates non-negativity constraints. This broadens applicability to functions such as beta-divergences and improves screening performance in previously covered cases such as logistic regression.

Sparse optimization problems are ubiquitous in many fields such as statistics, signal/image processing and machine learning. This has led to the development of many iterative algorithms to solve them. A powerful strategy to boost the performance of these algorithms is known as safe screening: it allows the early identification of zero coordinates in the solution, which can then be eliminated to reduce the problem's size and accelerate convergence. In this work, we extend the existing Gap Safe screening framework by relaxing the global strong-concavity assumption on the dual cost function. Instead, we exploit local regularity properties, that is, strong concavity on well-chosen subsets of the domain. The non-negativity constraint is also integrated into the existing framework. Besides extending safe screening to a broader class of functions that includes beta-divergences (e.g., the Kullback-Leibler divergence), the proposed approach also improves upon the existing Gap Safe screening rules in previously applicable cases (e.g., logistic regression). The proposed general framework is exemplified by some notable particular cases: the logistic function, the beta-divergence with beta = 1.5, and the Kullback-Leibler divergence. Finally, we showcase the effectiveness of the proposed screening rules with different solvers (coordinate descent, multiplicative-update and proximal gradient algorithms) and different data sets (binary classification, hyperspectral and count data).
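To make the screening idea concrete, here is a minimal sketch of the classical Gap Safe sphere test for the Lasso, min_x 0.5*||y - Ax||^2 + lam*||x||_1. It is an illustration of the existing framework under the global strong-concavity assumption that the abstract says the paper relaxes. It is not the paper's local rule and not the authors' code. The function name `gap_safe_screen_lasso` and the random test data are hypothetical, chosen only for this example.

```python
import numpy as np

def gap_safe_screen_lasso(A, y, x, lam):
    """Classical Gap Safe sphere test for the Lasso (illustrative sketch).

    Returns (mask, gap); mask[j] is True when coordinate j is certified
    to be zero in every optimal solution.
    """
    residual = y - A @ x
    # Rescale the residual so the dual point is feasible: ||A^T theta||_inf <= 1.
    theta = residual / max(lam, np.max(np.abs(A.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(x))
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)  # clamp tiny negative values from round-off
    # The Lasso dual is lam^2-strongly concave, so the dual optimum lies in
    # a ball of this radius centred at theta.
    radius = np.sqrt(2.0 * gap) / lam
    col_norms = np.linalg.norm(A, axis=0)
    # Safe test: the worst case of |a_j^T theta'| over the ball is below 1.
    mask = np.abs(A.T @ theta) + radius * col_norms < 1.0
    return mask, gap

# Hypothetical data, for demonstration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
y = rng.standard_normal(50)
lam_max = np.max(np.abs(A.T @ y))  # above this value the zero vector is optimal
mask, gap = gap_safe_screen_lasso(A, y, np.zeros(200), 1.1 * lam_max)
print(mask.sum(), "of", mask.size, "coordinates screened; gap =", gap)
```

In a solver, such a test would typically be re-run every few iterations: as the duality gap shrinks, so does the radius, and more coordinates can be discarded. The radius formula is the step that depends on global strong concavity of the dual, which is the assumption the paper replaces with strong concavity on well-chosen subsets of the domain.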

Code Implementations: 1 repo
