A Robust Unsupervised Ensemble of Feature-Based Explanations using Restricted Boltzmann Machines
This work addresses the need for more reliable and robust interpretations of deep learning models, which is crucial for their wider acceptance in applications. The contribution appears incremental, however, because it builds on existing ensemble and RBM techniques.
The paper tackles the problem of divergent and conflicting explanations produced by different feature attribution methods for deep neural networks. It proposes aggregating these attributions with Restricted Boltzmann Machines (RBMs) and reports that the resulting method outperforms popular attribution methods and basic ensembles on real-world datasets.
Understanding the results of deep neural networks is an essential step towards wider acceptance of deep learning algorithms. Many approaches address the issue of interpreting artificial neural networks, but often provide divergent explanations. Moreover, different hyperparameters of an explanatory method can lead to conflicting interpretations. In this paper, we propose a technique for aggregating the feature attributions of different explanatory algorithms using Restricted Boltzmann Machines (RBMs) to achieve a more reliable and robust interpretation of deep neural networks. Several challenging experiments on real-world datasets show that the proposed RBM method outperforms popular feature attribution methods and basic ensemble techniques.
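To make the aggregation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each explanatory method yields one attribution score per input feature, that scores are min-max normalized to [0, 1] per method, and that an RBM with a single hidden unit is trained with one-step contrastive divergence, treating each feature as a training sample whose visible units are the scores from the different methods. The function names (`normalize`, `rbm_aggregate`), the hyperparameters, and the final sign-alignment step are all illustrative assumptions; the abstract does not specify any of them.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def normalize(attributions):
    """Min-max scale absolute attributions to [0, 1], separately per method (assumption)."""
    a = np.abs(np.asarray(attributions, dtype=float))
    lo = a.min(axis=1, keepdims=True)
    hi = a.max(axis=1, keepdims=True)
    return (a - lo) / np.maximum(hi - lo, 1e-12)


def rbm_aggregate(attributions, n_iter=500, lr=0.1, seed=0):
    """Aggregate attributions of shape (n_methods, n_features) into one score per feature.

    Hypothetical sketch: a single-hidden-unit RBM trained with CD-1.
    """
    rng = np.random.default_rng(seed)
    v0 = normalize(attributions).T            # (n_features, n_methods)
    n_methods = v0.shape[1]
    w = rng.normal(0.0, 0.01, size=(n_methods, 1))
    b_v = np.zeros(n_methods)                 # visible biases
    b_h = np.zeros(1)                         # hidden bias

    for _ in range(n_iter):
        # Positive phase: hidden probabilities given the data.
        h0 = sigmoid(v0 @ w + b_h)
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step (reconstruction, then hidden probabilities).
        v1 = sigmoid(h_sample @ w.T + b_v)
        h1 = sigmoid(v1 @ w + b_h)
        # CD-1 parameter updates, averaged over features.
        w += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        b_v += lr * (v0 - v1).mean(axis=0)
        b_h += lr * (h0 - h1).mean(axis=0)

    scores = sigmoid(v0 @ w + b_h).ravel()
    # The hidden unit's polarity is arbitrary; align it with the mean attribution (assumption).
    mean_attr = v0.mean(axis=1)
    if np.dot(scores - scores.mean(), mean_attr - mean_attr.mean()) < 0:
        scores = 1.0 - scores
    return scores


def mean_ensemble(attributions):
    """Basic ensemble baseline: average of the normalized attributions."""
    return normalize(attributions).mean(axis=0)


# Toy example: three hypothetical attribution methods scoring six features.
toy = np.array([
    [0.9, 0.1, 0.8, 0.0, 0.2, 0.7],
    [0.8, 0.2, 0.9, 0.1, 0.1, 0.6],
    [0.7, 0.0, 0.6, 0.2, 0.3, 0.9],
])
print("RBM aggregate:", np.round(rbm_aggregate(toy), 3))
print("Mean ensemble:", np.round(mean_ensemble(toy), 3))
```

The mean ensemble weights every method equally, whereas the RBM learns one weight per method from the data, which is one plausible reading of why a learned aggregation might be more robust to a single outlying explanation.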