LG · ML · Nov 18, 2018

Understanding Learned Models by Identifying Important Features at the Right Resolution

arXiv:1811.07279v2 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretability in machine learning, particularly in high-stakes domains such as biomedicine, by providing a rigorous, model-agnostic tool for feature analysis. The advance is incremental because it builds on existing interpretability methods.

The paper tackles the problem of interpreting complex learned models by identifying important features and interactions at appropriate resolutions. It presents a model-agnostic method that uses hypothesis testing and hierarchical false discovery rate control to rigorously assess feature importance, and evaluates the method on random forest and LSTM models in two biomedical applications.

In many application domains, it is important to characterize how complex learned models make their decisions across the distribution of instances. One way to do this is to identify the features and interactions among them that contribute to a model's predictive accuracy. We present a model-agnostic approach to this task that makes the following specific contributions. Our approach (i) tests feature groups, in addition to base features, and tries to determine the level of resolution at which important features can be determined, (ii) uses hypothesis testing to rigorously assess the effect of each feature on the model's loss, (iii) employs a hierarchical approach to control the false discovery rate when testing feature groups and individual base features for importance, and (iv) uses hypothesis testing to identify important interactions among features and feature groups. We evaluate our approach by analyzing random forest and LSTM neural network models learned in two challenging biomedical applications.
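
The abstract does not say how features are perturbed or which test statistic is used, so the following is only a minimal sketch of the general idea of testing feature groups against a model's loss. It assumes that a group is perturbed by jointly permuting its columns, that significance is assessed with a paired sign-flip randomization test on per-instance loss changes, and that multiplicity is handled with a flat Benjamini-Hochberg correction, which is a simpler stand-in for the hierarchical procedure the paper describes. The toy model, the group names, and all function names are hypothetical.

```python
import numpy as np

def perturbed_loss_delta(predict, X, y, cols, rng):
    """Per-instance increase in squared loss after jointly permuting `cols`."""
    X_perm = X.copy()
    idx = rng.permutation(X.shape[0])
    X_perm[:, cols] = X[idx][:, cols]      # shuffle the whole group together
    base = (predict(X) - y) ** 2
    pert = (predict(X_perm) - y) ** 2
    return pert - base

def sign_flip_pvalue(delta, rng, n_draws=2000):
    """One-sided paired randomization test of H0: mean loss increase <= 0."""
    observed = delta.mean()
    signs = rng.choice([-1.0, 1.0], size=(n_draws, delta.size))
    null = (signs * delta).mean(axis=1)
    return (1 + np.sum(null >= observed)) / (n_draws + 1)

def benjamini_hochberg(pvals, q=0.05):
    """Flat BH step-up procedure (assumption: not the paper's hierarchical one)."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()
        reject[order[:k + 1]] = True
    return reject

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 2 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=500)

# Hypothetical stand-in for a learned model: it uses features 0 and 1 only.
predict = lambda X: 2 * X[:, 0] + X[:, 1]

# Two feature groups plus one base feature from each group.
groups = {"signal": [0, 1], "noise": [2, 3], "x0": [0], "x2": [2]}
pvals = [sign_flip_pvalue(perturbed_loss_delta(predict, X, y, cols, rng), rng)
         for cols in groups.values()]
rejected = dict(zip(groups, benjamini_hochberg(pvals).tolist()))
print(rejected)
```

Testing the group before its members mirrors the resolution question in the abstract: if a group shows no effect on the loss, there is little reason to test its individual features.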

Code Implementations: 1 repo
