ML · LG · CO · Oct 19, 2021

abess: A Fast Best Subset Selection Library in Python and R

arXiv:2110.09697v2 · 34 citations · Has Code
AI Analysis

abess provides a fast and reliable tool for variable selection in machine learning, where best-subset selection is a known computational bottleneck.

The authors introduce abess, a library for best-subset selection in machine learning tasks such as regression and classification. Under the linear model, it certifiably finds the optimal solution in polynomial time with high probability, and it runs as fast as or up to 20 times faster than existing toolboxes.

We introduce a new library named abess that implements a unified framework of best-subset selection for solving diverse machine learning problems, e.g., linear regression, classification, and principal component analysis. In particular, abess certifiably attains the optimal solution in polynomial time with high probability under the linear model. Our efficient implementation allows abess to solve best-subset selection problems as fast as, or even 20x faster than, existing competing variable (model) selection toolboxes. Furthermore, it supports common variants such as best group subset selection and $\ell_2$-regularized best-subset selection. The core of the library is programmed in C++. For ease of use, a Python library is designed to integrate conveniently with scikit-learn, and it can be installed from the Python Package Index. In addition, a user-friendly R library is available from the Comprehensive R Archive Network. The source code is available at: https://github.com/abess-team/abess.
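
To make the problem concrete, the sketch below solves best-subset selection for linear regression by exhaustive search: it fits least squares on every subset of k columns and keeps the one with the smallest residual sum of squares. This is not the abess algorithm and does not use the abess API; it is a brute-force baseline written with the standard library only, and all function names, the synthetic data, and the chosen support (1, 4, 6) are assumptions made for illustration.

```python
from itertools import combinations
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_subset(X, y, subset):
    """Least squares restricted to `subset`; returns (RSS, coefficients)."""
    # Normal equations: (Xs^T Xs) beta = Xs^T y
    gram = [[sum(row[a] * row[b] for row in X) for b in subset] for a in subset]
    rhs = [sum(row[a] * yi for row, yi in zip(X, y)) for a in subset]
    beta = solve_linear_system(gram, rhs)
    rss = sum(
        (yi - sum(bj * row[j] for bj, j in zip(beta, subset))) ** 2
        for row, yi in zip(X, y)
    )
    return rss, beta


def best_subset(X, y, k):
    """Exhaustively search all size-k subsets for the minimum RSS."""
    p = len(X[0])
    best = None
    for subset in combinations(range(p), k):
        rss, beta = fit_subset(X, y, subset)
        if best is None or rss < best[0]:
            best = (rss, subset, beta)
    return best


# Synthetic data (hypothetical): only columns 1, 4 and 6 carry signal.
random.seed(0)
n, p = 60, 8
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3.0 * r[1] - 2.0 * r[4] + 1.5 * r[6] + random.gauss(0, 0.1) for r in X]

rss, support, beta = best_subset(X, y, k=3)
print("selected columns:", support)
print("coefficients:", [round(b, 2) for b in beta])
```

With 8 candidate variables and k = 3 the search visits only 56 subsets, but the count grows combinatorially with the number of variables, which is why exhaustive search is impractical at scale and why the polynomial-time guarantee described in the abstract matters.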

