LG | Sep 7, 2022

Seeking Interpretability and Explainability in Binary Activated Neural Networks

arXiv:2209.03450v3 | 4 citations | h-index: 19
AI Analysis

This work addresses interpretability for users of neural networks in tabular data regression, though it appears incremental as it builds on existing binary activation and SHAP methods.

The paper tackles the problem of interpretability in neural networks for regression on tabular data by using binary activated networks, providing expressiveness guarantees and a greedy algorithm to build compact models without pre-fixed architectures.

We study the use of binary activated neural networks as interpretable and explainable predictors in the context of regression tasks on tabular data; more specifically, we provide guarantees on their expressiveness and present an approach based on the efficient computation of SHAP values for quantifying the relative importance of the features, hidden neurons and even weights. As the model's simplicity is instrumental in achieving interpretability, we propose a greedy algorithm for building compact binary activated networks. This approach doesn't need to fix an architecture for the network in advance: it is built one layer at a time, one neuron at a time, leading to predictors that aren't needlessly complex for a given task.
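
To make the model class concrete, here is a minimal sketch of a forward pass through a binary activated network for regression. It is an illustration only, not the paper's implementation: the function names (`sign`, `ban_predict`), the toy weights, and the convention that a pre-activation of exactly 0 maps to +1 are all assumptions made for this example. It does not cover the greedy construction algorithm or the SHAP computation.

```python
def sign(z):
    # Binary activation: returns -1 or +1.
    # Sending z == 0 to +1 is a convention assumed for this sketch.
    return 1.0 if z >= 0 else -1.0


def ban_predict(x, hidden_layers, out_weights, out_bias):
    """Predict a real value with a binary activated network.

    hidden_layers: list of (weights, biases) pairs, one per hidden layer,
                   where weights is a list of rows (one row per neuron).
    out_weights, out_bias: parameters of the linear output layer.
    """
    h = list(x)
    for weights, biases in hidden_layers:
        # Each hidden neuron thresholds a linear function of the previous layer.
        h = [
            sign(sum(w * v for w, v in zip(row, h)) + b)
            for row, b in zip(weights, biases)
        ]
    # The output is a linear combination of the last layer's binary codes.
    return sum(w * v for w, v in zip(out_weights, h)) + out_bias


# Hypothetical toy network: 2 input features, 1 hidden layer with 2 neurons.
hidden_layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [-0.5, -2.0]),
]
out_weights = [3.0, 1.5]
out_bias = 10.0

print(ban_predict([1.0, 1.0], hidden_layers, out_weights, out_bias))
print(ban_predict([0.0, 3.0], hidden_layers, out_weights, out_bias))
```

Because every hidden neuron outputs only -1 or +1, a network whose last hidden layer has d neurons can produce at most 2^d distinct predictions. Each neuron reads as a simple threshold rule on its inputs, which is why keeping the network compact matters for interpretability.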

