Explaining Deep Learning Models - A Bayesian Non-parametric Approach
This work addresses the challenge of interpreting machine learning models for users, offering a novel method to inspect models as complete entities, though it appears incremental in the context of existing explanation techniques.
The authors address the problem of explaining deep learning models by proposing a Bayesian non-parametric regression mixture model augmented with multiple elastic nets. In image recognition experiments, the method outperformed state-of-the-art techniques at explaining individual decisions and helped uncover vulnerabilities in the target models.
Understanding and interpreting how machine learning (ML) models make decisions has been a major challenge. While recent research has proposed various technical approaches that provide some clues as to how an ML model makes individual predictions, these approaches do not give users the ability to inspect a model as a complete entity. In this work, we propose a novel technical approach that augments a Bayesian non-parametric regression mixture model with multiple elastic nets. Using the enhanced mixture model, we can extract generalizable insights for a target model through a global approximation. To demonstrate the utility of our approach, we evaluate it on different ML models in the context of image recognition. The empirical results indicate that our proposed approach not only outperforms state-of-the-art techniques in explaining individual decisions but also enables users to discover vulnerabilities in the target ML models.
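
To make the elastic-net ingredient concrete, the sketch below fits a single elastic-net-regularized linear surrogate to the outputs of a toy model. It is an illustration only, not the authors' method: it assumes that each mixture component behaves like a sparse linear regressor, and it leaves out the Bayesian non-parametric mixture entirely. The names `target_model` and `fit_elastic_net`, the penalty settings, and the synthetic data are all hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, alpha=0.05, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for a no-intercept elastic net.

    Minimizes (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
              + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(d):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_wj
    return w


def target_model(x):
    """Hypothetical stand-in for a black-box model; it ignores x[2]."""
    return 2.0 * x[0] - 1.0 * x[1]


random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
y = [target_model(x) for x in X]

weights = fit_elastic_net(X, y)
print(weights)
```

In this toy setting the L1 part of the penalty should drive the coefficient of the unused third feature to zero, while the first two coefficients stay close to, but slightly shrunk from, 2 and -1. The nonzero coefficients are what one would read as the explanation. In a mixture, several such regressors would each cover a different region of the input space.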