ML LG | Jun 29, 2025

AICO: Feature Significance Tests for Supervised Learning

Kay Giesecke, Enguerrand Horel, Chartsiri Jirachotkulthorn

arXiv:2506.23396v4 | 4.5 | h-index: 6

Originality: Highly original

AI Analysis

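This work addresses the need for transparency and trust in machine learning among researchers, practitioners, and policymakers. It offers a practical answer to a known bottleneck: existing feature-influence tools mostly lack statistical guarantees or require costly retraining.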

The paper tackles the problem of identifying which input features truly drive predictions in supervised learning models. It introduces AICO, a framework that provides exact statistical tests of feature significance without retraining, and reports reliable results in applications such as credit scoring and mortgage-behavior prediction.

Machine learning has become a central tool across scientific, industrial, and policy domains. Algorithms now identify chemical properties, forecast disease risk, screen borrowers, and guide public interventions. Yet this predictive power often comes at the cost of transparency: we rarely know which input features truly drive a model's predictions. Without such understanding, researchers cannot draw reliable scientific conclusions, practitioners cannot ensure fairness or accountability, and policy makers cannot trust or govern model-based decisions. Despite its importance, existing tools for assessing feature influence are limited -- most lack statistical guarantees, and many require costly retraining or surrogate modeling, making them impractical for large modern models. We introduce AICO, a broadly applicable framework that turns model interpretability into an efficient statistical exercise. AICO asks, for any trained regression or classification model, whether each feature genuinely improves model performance. It does so by masking the feature's information and measuring the resulting change in performance. The method delivers exact, finite-sample inference -- exact feature p-values and confidence intervals -- without any retraining, surrogate modeling, or distributional assumptions, making it feasible for today's large-scale algorithms. In both controlled experiments and real applications -- from credit scoring to mortgage-behavior prediction -- AICO consistently pinpoints the variables that drive model behavior, providing a fast and reliable path toward transparent and trustworthy machine learning.
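The mask-and-compare idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the masking scheme, the score, or the test. It assumes, purely for illustration, that a feature is masked by replacing it with its training mean, that performance is measured by per-sample squared error, and that significance is assessed with a one-sided exact sign test on the per-sample loss changes. The names `sign_test_pvalue` and `masked_feature_test` are hypothetical.

```python
import numpy as np
from math import comb


def sign_test_pvalue(deltas):
    """One-sided exact sign test of H0: median(delta) <= 0.

    Under H0 the number of positive deltas is at most Binomial(n, 1/2),
    so the p-value is the exact upper binomial tail. Zero deltas are dropped.
    """
    nonzero = deltas[deltas != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    k = int(np.sum(nonzero > 0))
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


def masked_feature_test(predict, X_test, y_test, feature, baseline):
    """Mask one feature (assumed scheme: constant baseline) and test whether
    the per-sample squared error tends to increase. No retraining is done."""
    X_masked = X_test.copy()
    X_masked[:, feature] = baseline
    loss_full = (y_test - predict(X_test)) ** 2
    loss_masked = (y_test - predict(X_masked)) ** 2
    return sign_test_pvalue(loss_masked - loss_full)


# Synthetic data: only feature 0 influences the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=400)
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

# Any already-trained model works; here, ordinary least squares.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)


def predict(A):
    return A @ coef


baseline = X_train.mean(axis=0)
p_values = [
    masked_feature_test(predict, X_test, y_test, j, baseline[j])
    for j in range(X.shape[1])
]
print(p_values)
```

In this sketch the test runs only on held-out data and reuses the trained model as-is, which mirrors the "no retraining" property the abstract emphasizes. The sign test is chosen here because its null distribution is exactly binomial at any sample size; whether AICO uses this particular test is an assumption.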
