MLDec 3, 2018Code
Sensitivity based Neural Networks ExplanationsEnguerrand Horel, Virgile Mison, Tao Xiong et al.
Although neural networks can achieve very high predictive performance on various different tasks such as image recognition or natural language processing, they are often considered as opaque "black boxes". The difficulty of interpreting the predictions of a neural network often prevents its use in fields where explainability is important, such as the financial industry where regulators and auditors often insist on this aspect. In this paper, we present a way to assess the relative input features importance of a neural network based on the sensitivity of the model output with respect to its input. This method has the advantage of being fast to compute, it can provide both global and local levels of explanations and is applicable for many types of neural network architectures. We illustrate the performance of this method on both synthetic and real data and compare it with other interpretation techniques. This method is implemented into an open-source Python package that allows its users to easily generate and visualize explanations for their neural networks.
MLJun 29, 2025
AICO: Feature Significance Tests for Supervised LearningKay Giesecke, Enguerrand Horel, Chartsiri Jirachotkulthorn
Machine learning has become a central tool across scientific, industrial, and policy domains. Algorithms now identify chemical properties, forecast disease risk, screen borrowers, and guide public interventions. Yet this predictive power often comes at the cost of transparency: we rarely know which input features truly drive a model's predictions. Without such understanding, researchers cannot draw reliable scientific conclusions, practitioners cannot ensure fairness or accountability, and policy makers cannot trust or govern model-based decisions. Despite its importance, existing tools for assessing feature influence are limited -- most lack statistical guarantees, and many require costly retraining or surrogate modeling, making them impractical for large modern models. We introduce AICO, a broadly applicable framework that turns model interpretability into an efficient statistical exercise. AICO asks, for any trained regression or classification model, whether each feature genuinely improves model performance. It does so by masking the feature's information and measuring the resulting change in performance. The method delivers exact, finite-sample inference -- exact feature p-values and confidence intervals -- without any retraining, surrogate modeling, or distributional assumptions, making it feasible for today's large-scale algorithms. In both controlled experiments and real applications -- from credit scoring to mortgage-behavior prediction -- AICO consistently pinpoints the variables that drive model behavior, providing a fast and reliable path toward transparent and trustworthy machine learning.
MLMay 23, 2019
Computationally Efficient Feature Significance and Importance for Machine Learning ModelsEnguerrand Horel, Kay Giesecke
We develop a simple and computationally efficient significance test for the features of a machine learning model. Our forward-selection approach applies to any model specification, learning task and variable type. The test is non-asymptotic, straightforward to implement, and does not require model refitting. It identifies the statistically significant features as well as feature interactions of any order in a hierarchical manner, and generates a model-free notion of feature importance. Experimental and empirical results illustrate its performance.
STFeb 16, 2019
Significance Tests for Neural NetworksEnguerrand Horel, Kay Giesecke
We develop a pivotal test to assess the statistical significance of the feature variables in a single-layer feedforward neural network regression model. We propose a gradient-based test statistic and study its asymptotics using nonparametric techniques. Under technical conditions, the limiting distribution is given by a mixture of chi-square distributions. The tests enable one to discern the impact of individual variables on the prediction of a neural network. The test statistic can be used to rank variables according to their influence. Simulation results illustrate the computational efficiency and the performance of the test. An empirical application to house price valuation highlights the behavior of the test using actual data.