ME · AI · SP · ML · Aug 17, 2023

Spectral information criterion for automatic elbow detection

arXiv:2308.09108v1 · 15 citations · h-index: 32
Originality: Incremental advance
AI Analysis

This is an incremental improvement for researchers and practitioners in machine learning and statistics who need automated model selection tools.

The authors tackle the problem of automatically detecting the "elbow" in error curves for model selection. They introduce the Spectral Information Criterion (SIC), which generalizes existing criteria such as BIC and AIC and does not strictly require a likelihood function. SIC returns a subset of candidate models whose cardinality is often much smaller than the total number of possible models, and it performs well in synthetic and real-data experiments on tasks such as clustering and variable selection.

We introduce a generalized information criterion that contains other well-known information criteria, such as the Bayesian information criterion (BIC) and the Akaike information criterion (AIC), as special cases. Furthermore, the proposed spectral information criterion (SIC) is more general than the other information criteria, e.g., since knowledge of a likelihood function is not strictly required. SIC extracts geometric features of the error curve and, as a consequence, it can be considered an automatic elbow detector. SIC provides a subset of all possible models, with a cardinality that is often much smaller than the total number of possible models. The elements of this subset are elbows of the error curve. A practical rule for selecting a unique model within the set of elbows is suggested as well. Theoretical invariance properties of SIC are analyzed. Moreover, we test SIC in ideal scenarios, where it always provides the optimal expected results. We also test SIC in several numerical experiments: some involving synthetic data, and two involving real datasets. All address real-world applications such as clustering, variable selection, and polynomial order selection, to name a few. The results show the benefits of the proposed scheme. Matlab code related to the experiments is also provided. Finally, possible future research lines are discussed.
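
To make the idea of "elbows as geometric features of the error curve" concrete, here is a minimal Python sketch. It is not the SIC definition from the paper, which the abstract does not spell out. It assumes, purely for illustration, that elbow candidates are the interior vertices of the lower convex hull of the points (k, V(k)), where k is the model order and V(k) the error. The function names `lower_convex_hull` and `elbow_candidates` and the sample error values are hypothetical.

```python
def _cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_convex_hull(ks, errors):
    """Return the vertices of the lower convex hull of the points (k, error).

    Uses the monotone-chain construction: points are scanned in order of
    increasing k, and any point that does not make a strict left turn is
    dropped from the hull.
    """
    points = sorted(zip(ks, errors))
    hull = []
    for p in points:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def elbow_candidates(ks, errors):
    """Illustrative assumption: elbows are the interior hull vertices."""
    hull = lower_convex_hull(ks, errors)
    return [k for k, _ in hull[1:-1]]


if __name__ == "__main__":
    # Hypothetical error curve for model orders k = 0..6.
    ks = list(range(7))
    errors = [100, 90, 30, 28, 25, 10, 9]
    print(elbow_candidates(ks, errors))
```

On this toy curve, only two of the seven candidate orders survive as elbow candidates, which mirrors the abstract's point that the returned subset can be much smaller than the full set of models. Choosing a single model among the candidates would need an additional rule, which the abstract mentions but does not specify.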

