ML | LG | OC | Apr 25, 2024

Automated Model Selection for Generalized Linear Models

arXiv:2404.16560v1 | 1 citation | h-index: 1 | Comput Stat
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of automating model selection for statisticians and data scientists. The contribution appears incremental, since it builds on existing optimization and constraint ideas.

The paper tackles the problem of automated model selection for generalized linear models by using mixed-integer conic optimization to combine feature subset selection with holistic models, directly optimizing for Akaike and Bayesian information criteria while imposing constraints to handle multicollinearity.

In this paper, we show how mixed-integer conic optimization can be used to combine feature subset selection with holistic generalized linear models to fully automate the model selection process. Concretely, we directly optimize for the Akaike and Bayesian information criteria while imposing constraints designed to deal with multicollinearity in the feature selection task. Specifically, we propose a novel pairwise correlation constraint that combines the sign coherence constraint with ideas from classical statistical models like Ridge regression and the OSCAR model.
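To make the idea of optimizing an information criterion directly over feature subsets more concrete, here is a minimal sketch under several assumptions that are not taken from the paper. It uses a Gaussian linear model (the simplest GLM), replaces the mixed-integer conic solver with brute-force enumeration, and stands in for the multicollinearity handling with a simple rule that forbids selecting two features whose absolute sample correlation exceeds a threshold. This simple rule is not the paper's novel pairwise correlation constraint. The names `gaussian_aic`, `select_subset`, and `rho_max` are hypothetical.

```python
import itertools

import numpy as np


def gaussian_aic(X, y, subset):
    """AIC (up to an additive constant) of a least-squares fit with intercept."""
    n = len(y)
    # Design matrix: intercept column plus the selected feature columns.
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1] + 1  # regression coefficients plus the noise variance
    return n * np.log(rss / n) + 2 * k


def select_subset(X, y, rho_max=0.8):
    """Enumerate all feature subsets and return the AIC-minimal admissible one.

    Assumption: a subset is admissible only if no pair of selected features
    has absolute sample correlation above rho_max. Enumeration costs O(2^p),
    so this is only practical for a handful of features.
    """
    p = X.shape[1]
    corr = np.corrcoef(X, rowvar=False)
    best_aic, best_subset = np.inf, ()
    for size in range(p + 1):
        for subset in itertools.combinations(range(p), size):
            pairs = itertools.combinations(subset, 2)
            if any(abs(corr[i, j]) > rho_max for i, j in pairs):
                continue  # violates the pairwise correlation rule
            aic = gaussian_aic(X, y, subset)
            if aic < best_aic:
                best_aic, best_subset = aic, subset
    return best_subset, best_aic


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    x0 = rng.normal(size=n)
    x1 = x0 + 0.01 * rng.normal(size=n)  # nearly a copy of x0
    x2 = rng.normal(size=n)
    x3 = rng.normal(size=n)              # pure noise feature
    X = np.column_stack([x0, x1, x2, x3])
    y = 2.0 * x0 - 3.0 * x2 + 0.1 * rng.normal(size=n)
    print(select_subset(X, y))
```

Swapping the penalty `2 * k` for `k * np.log(n)` would give the BIC instead of the AIC. A mixed-integer formulation, as described in the abstract, avoids the exhaustive loop by encoding subset membership with binary variables and letting a solver search the space.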
