ML LG | Jun 22, 2022

Diagnostic Tool for Out-of-Sample Model Evaluation

Ludvig Hult, Dave Zachariah, Petre Stoica

arXiv:2206.10982v32.11 citations | h-index: 108 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of reliable model evaluation for practitioners. It offers a practical tool with theoretical guarantees, though the advance is incremental because it builds on existing calibration methods.

The paper tackles the problem of evaluating model performance on future data. It proposes a diagnostic tool that uses a finite calibration dataset to characterize out-of-sample losses, provides finite-sample guarantees under weak assumptions, and demonstrates its use in quantifying distribution shifts and aiding model selection.

Assessment of model fitness is a key part of machine learning. The standard paradigm is to learn models by minimizing a chosen loss function averaged over training data, with the aim of achieving small losses on future data. In this paper, we consider the use of a finite calibration data set to characterize the future, out-of-sample losses of a model. We propose a simple model diagnostic tool that provides finite-sample guarantees under weak assumptions. The tool is simple to compute and to interpret. Several numerical experiments are presented to show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyper-parameter tuning.
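
To make the idea of a finite-sample guarantee from calibration data concrete, here is a minimal sketch in Python. It is not taken from the paper and may differ from the authors' construction. It relies on a standard order-statistic fact: if a future loss is exchangeable with n calibration losses, then the k-th smallest calibration loss bounds the future loss with probability at least k/(n+1). The function name `loss_bound` and the simulated squared-error losses are hypothetical, chosen only for illustration.

```python
import math
import random


def loss_bound(calibration_losses, alpha):
    """Return a level t with P(future loss <= t) >= 1 - alpha.

    Assumption (not from the source): the future loss is exchangeable
    with the calibration losses, e.g. all are i.i.d. draws.
    """
    n = len(calibration_losses)
    # Smallest rank k such that k / (n + 1) >= 1 - alpha.
    k = math.ceil((1 - alpha) * (n + 1))
    if k > n:
        # Too few calibration points to certify this level.
        return math.inf
    return sorted(calibration_losses)[k - 1]


# Hypothetical calibration losses: squared errors of a model whose
# residuals are standard normal (simulated here for illustration).
random.seed(0)
calibration = [random.gauss(0.0, 1.0) ** 2 for _ in range(199)]

for alpha in (0.5, 0.1, 0.05):
    bound = loss_bound(calibration, alpha)
    print(f"alpha={alpha}: loss bound = {bound:.3f}")
```

Evaluating the bound over a range of alpha values gives a curve of loss levels against guaranteed coverage. A curve of this kind could be compared across models or hyper-parameter settings, provided the calibration data were not used for training.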

