Enumerating the k-fold configurations in multi-class classification problems
The work addresses the reproducibility crisis in AI by giving researchers and practitioners tools to verify reported experimental setups. The contribution is incremental, since it builds on prior work on binary classification.
The paper tackles the irreproducibility of k-fold cross-validation scores in AI with a method for testing the consistency of reported scores. The method requires enumerating all k-fold configurations, and the authors previously proposed an algorithm for this enumeration in the binary classification case.
K-fold cross-validation is a widely used tool for assessing classifier performance. The reproducibility crisis faced by artificial intelligence partly results from the irreproducibility of reported k-fold cross-validation-based performance scores. Recently, we introduced numerical techniques to test the consistency of claimed performance scores and experimental setups. In a crucial use case, the method relies on the combinatorial enumeration of all k-fold configurations, for which we proposed an algorithm in the binary classification case.
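To make the enumeration concrete, the following is a minimal Python sketch for the binary case. It is an illustration under stated assumptions, not the algorithm proposed by the authors. It assumes that a configuration is described by the number of positive and negative samples in each fold, that fold sizes differ by at most one, and that folds of equal size are interchangeable. The names `fold_sizes` and `enumerate_configurations` are hypothetical.

```python
def fold_sizes(total, k):
    """Split `total` samples into k folds whose sizes differ by at most one.

    Assumption: the first `total % k` folds receive one extra sample.
    """
    base, extra = divmod(total, k)
    return [base + 1] * extra + [base] * (k - extra)


def enumerate_configurations(p, n, k):
    """Yield every distinct k-fold configuration for p positives and n negatives.

    Each configuration is a tuple of (positives, negatives) pairs, one per fold.
    Folds of equal size are treated as interchangeable, so within a group of
    equal-sized folds the positive counts are kept non-decreasing to avoid
    emitting permutations of the same configuration.
    """
    sizes = fold_sizes(p + n, k)

    def recurse(idx, remaining_p, prev_pos):
        if idx == k:
            # A configuration is valid only if all positives were placed.
            if remaining_p == 0:
                yield ()
            return
        size = sizes[idx]
        # Symmetry breaking applies only between folds of the same size.
        same_group = idx > 0 and sizes[idx - 1] == size
        lowest = prev_pos if same_group else 0
        for pos in range(lowest, min(size, remaining_p) + 1):
            for rest in recurse(idx + 1, remaining_p - pos, pos):
                yield ((pos, size - pos),) + rest

    yield from recurse(0, p, 0)


if __name__ == "__main__":
    for config in enumerate_configurations(p=2, n=3, k=2):
        print(config)
```

Further constraints, such as requiring every fold to contain at least one sample of each class, could be added as a filter inside the loop. Whether such constraints apply depends on the experimental setup being tested.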