LG AI ME ML

Jul 26, 2025

Irredundant $k$-Fold Cross-Validation

arXiv:2507.20048v2

h-index: 31

Originality: Incremental advance

AI Analysis

The paper addresses a methodological problem for machine learning practitioners by providing a more balanced and efficient validation approach. The advance is incremental, since it builds on existing cross-validation methods.

The paper tackles the redundancy in traditional $k$-fold cross-validation, where each instance is used $k-1$ times for training. It introduces Irredundant $k$-fold cross-validation, which uses each instance exactly once for training and once for testing. The authors report less optimistic variance estimates and a significantly reduced computational cost.
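As a rough count of where the cost saving comes from, assume $n$ instances and equal-sized folds (the symbol $n$ is introduced here for illustration only). The total number of training-instance uses across the whole procedure is then:

$$
\underbrace{k \cdot \frac{(k-1)\,n}{k}}_{\text{traditional}} = (k-1)\,n
\qquad \text{versus} \qquad
\underbrace{n}_{\text{irredundant}}
$$

For $k = 10$, the traditional procedure processes nine times as many training instances in total. How this translates into run time depends on how the learner's cost scales with training-set size.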

In traditional k-fold cross-validation, each instance is used ($k-1$) times for training and once for testing, leading to redundancy that lets many instances disproportionately influence the learning phase. We introduce Irredundant $k$-fold cross-validation, a novel method that guarantees each instance is used exactly once for training and once for testing across the entire validation procedure. This approach ensures a more balanced utilization of the dataset, mitigates overfitting due to instance repetition, and enables sharper distinctions in comparative model analysis. The method preserves stratification and remains model-agnostic, i.e., compatible with any classifier. Experimental results demonstrate that it delivers consistent performance estimates across diverse datasets -- comparable to $k$-fold cross-validation -- while providing less optimistic variance estimates because training partitions are non-overlapping, and significantly reducing the overall computational cost.
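The property stated in the abstract (each instance trains exactly once and tests exactly once) can be illustrated with a minimal sketch. The construction below is an assumption chosen for simplicity, not necessarily the paper's algorithm: it shuffles the indices, splits them into $k$ folds, and in iteration $i$ tests on fold $i$ while training on the next fold in cyclic order. The function name `irredundant_kfold_indices` is hypothetical, and stratification is omitted for brevity.

```python
import numpy as np


def irredundant_kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs in which every index appears
    exactly once in a training set and exactly once in a test set.

    Hypothetical helper: one simple construction that satisfies the
    property described in the abstract, not the paper's own algorithm.
    Stratification is not handled here.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    rng = np.random.default_rng(seed)
    # Shuffle once, then cut into k nearly equal, disjoint folds.
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        # Train on a different fold, so train and test never overlap
        # and no fold is reused for training in another iteration.
        train_idx = folds[(i + 1) % k]
        yield train_idx, test_idx
```

Any estimator with a fit/predict interface could be evaluated by looping over these pairs. Because each training set here has roughly $n/k$ instances instead of $(k-1)n/k$, individual models see much less data than in traditional $k$-fold cross-validation.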
