LG ML Feb 23, 2022

Fast Sparse Classification for Generalized Linear and Additive Models

arXiv:2202.11389v2 (24 citations)
AI Analysis

This work addresses the need for efficient and interpretable classification models in high-dimensional data settings, though it is incremental in nature.

The paper tackles the slow speed of classification with sparse generalized linear and additive models. Its algorithms are generally 2 to 5 times faster than previous approaches, and the resulting models have accuracy comparable to black-box models.

We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the logistic loss that allow us to efficiently screen features for elimination, as well as use of a priority queue that favors a more uniform exploration of features. As an alternative to the logistic loss, we propose the exponential loss, which permits an analytical solution to the line search at each iteration. Our algorithms are generally 2 to 5 times faster than previous approaches. They produce interpretable models that have accuracy comparable to black box models on challenging datasets.
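The abstract's remark that the exponential loss "permits an analytical solution to the line search" can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: features are binary (0/1), labels are in {-1, +1}, and one coefficient is updated at a time. Under those assumptions, the loss along coordinate j is D+ * exp(-d) + D- * exp(d) plus a constant, so the best step is d = 0.5 * ln(D+ / D-). The function names and the `eps` guard are hypothetical and are not taken from the paper's code.

```python
import math


def margin_weights(X, y, w):
    """Per-sample exponential-loss terms exp(-y_i * <w, x_i>)."""
    return [
        math.exp(-yi * sum(wk * xk for wk, xk in zip(w, xi)))
        for xi, yi in zip(X, y)
    ]


def exp_loss(X, y, w):
    """Total exponential loss of the linear model w on (X, y)."""
    return sum(margin_weights(X, y, w))


def closed_form_step(X, y, w, j, eps=1e-12):
    """Step for coefficient j that minimizes the exponential loss.

    Assumes (for illustration only) that feature j is 0/1 and labels
    are -1/+1. eps is a hypothetical guard against division by zero
    when one class is absent among samples with x_ij = 1.
    """
    d_pos = 0.0  # loss mass of positive samples where feature j is on
    d_neg = 0.0  # loss mass of negative samples where feature j is on
    for xi, yi, e in zip(X, y, margin_weights(X, y, w)):
        if xi[j] == 1:
            if yi > 0:
                d_pos += e
            else:
                d_neg += e
    return 0.5 * math.log((d_pos + eps) / (d_neg + eps))


# Toy data: column 0 is an intercept, column 1 is a binary feature.
X = [[1, 1], [1, 0], [1, 1], [1, 0], [1, 1]]
y = [1, -1, 1, 1, -1]
w = [0.0, 0.0]

step = closed_form_step(X, y, w, j=1)
w_new = [w[0], w[1] + step]
print(exp_loss(X, y, w), exp_loss(X, y, w_new))
```

No iterative line search is needed here: one pass over the samples yields the step. The logistic loss has no such closed form for this one-dimensional problem, which is why the abstract presents the exponential loss as an alternative.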

Code Implementations: 2 repos
