LG PR ML · Apr 27, 2024

On the Rashomon ratio of infinite hypothesis sets

arXiv:2404.17746v17.92 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses model selection and generalization in machine learning. It is an incremental advance: it extends prior results on the Rashomon ratio for finite classifier families to infinite families.

The paper quantifies the Rashomon ratio for infinite classifier families. It shows that when the ratio is large, choosing the classifier with the best empirical accuracy from a random subset of the family does not increase the empirical loss by much, and it provides methods for estimating the ratio with guarantees on their accuracy.

Given a classification problem and a family of classifiers, the Rashomon ratio measures the proportion of classifiers that yield less than a given loss. Previous work has explored the advantage of a large Rashomon ratio in the case of a finite family of classifiers. Here we consider the more general case of an infinite family. We show that a large Rashomon ratio guarantees that choosing the classifier with the best empirical accuracy among a random subset of the family, which is likely to improve generalizability, will not increase the empirical loss too much. We quantify the Rashomon ratio in two examples involving infinite classifier families in order to illustrate situations in which it is large. In the first example, we estimate the Rashomon ratio of the classification of normally distributed classes using an affine classifier. In the second, we obtain a lower bound for the Rashomon ratio of a classification problem with a modified Gram matrix when the classifier family consists of two-layer ReLU neural networks. In general, we show that the Rashomon ratio can be estimated using a training dataset along with random samples from the classifier family and we provide guarantees that such an estimation is close to the true value of the Rashomon ratio.
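The estimation idea in the last sentence of the abstract can be illustrated with a minimal Monte Carlo sketch. This is not the paper's procedure. It assumes an affine classifier family sign(w·x + b), a standard normal sampling distribution over (w, b), the 0-1 loss, and a loss threshold `theta`; all function and parameter names are hypothetical. The Hoeffding half-width is the standard bound on the sampling error of a Monte Carlo proportion for a fixed training set, and the guarantee derived in the paper may take a different form.

```python
import math

import numpy as np


def estimate_rashomon_ratio(X, y, theta, n_samples=10_000, seed=0):
    """Estimate the fraction of sampled affine classifiers whose
    empirical 0-1 loss on (X, y) is below theta.

    Assumption (not from the paper): classifiers are drawn with
    (w, b) ~ standard normal; labels y take values in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.standard_normal((n_samples, d))  # one weight vector per classifier
    b = rng.standard_normal(n_samples)       # one bias per classifier
    scores = X @ W.T + b                     # shape (n, n_samples)
    preds = np.where(scores >= 0, 1, -1)
    losses = (preds != y[:, None]).mean(axis=0)  # empirical loss per classifier
    return float((losses < theta).mean()), losses


def hoeffding_halfwidth(n_samples, delta):
    """Half-width eps such that, with probability at least 1 - delta,
    the Monte Carlo proportion is within eps of its expectation."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n_samples))


# Usage: two normally distributed classes in the plane (illustrative data).
rng = np.random.default_rng(42)
X_demo = np.vstack([rng.normal(-1.5, 1.0, (100, 2)),
                    rng.normal(+1.5, 1.0, (100, 2))])
y_demo = np.array([-1] * 100 + [1] * 100)
ratio, _ = estimate_rashomon_ratio(X_demo, y_demo, theta=0.2)
eps = hoeffding_halfwidth(10_000, delta=0.05)
print(f"estimated Rashomon ratio: {ratio:.3f} +/- {eps:.3f}")
```

The estimate depends on the sampling distribution chosen over the family, so a different distribution over (w, b) would generally give a different ratio for the same data and threshold.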
