LG AI · Apr 11, 2023

CGXplain: Rule-Based Deep Neural Network Explanations Using Dual Linear Programs

Konstantin Hemker, Zohreh Shams, Mateja Jamnik

arXiv:2304.05207v17.77 citations · h-index: 26 · Has Code

Originality: Incremental advance

AI Analysis

This work provides more stable and compact explanations for deep learning models, which is useful for users needing interpretable AI, though it is incremental as it builds on decompositional methods.

The paper tackled the problem of generating rule-based explanations for deep neural networks by addressing limitations in alignment, complexity, and stability of existing methods, resulting in a method that reduces rule set size by over 80% while maintaining or improving accuracy and fidelity.

Rule-based surrogate models are an effective and interpretable way to approximate a Deep Neural Network's (DNN) decision boundaries, allowing humans to easily understand deep learning models. Current state-of-the-art decompositional methods, which consider the DNN's latent space to extract more exact rule sets, derive rule sets with high accuracy. However, they a) do not guarantee that the surrogate model has learned from the same variables as the DNN (alignment), b) allow optimisation for only a single objective, such as accuracy, which can result in excessively large rule sets (complexity), and c) use decision tree algorithms as intermediate models, which can result in different explanations for the same DNN (stability). To address these limitations, this paper introduces CGX (Column Generation eXplainer), a decompositional method that uses dual linear programming to extract rules from the hidden representations of the DNN. This approach allows optimisation for any number of objectives and lets users tune the explanation model to their needs. We evaluate CGX on a wide variety of tasks and show that it meets all three criteria: the explanation model is exactly reproducible, which guarantees stability, and the rule set size is reduced by >80% (complexity) at equivalent or improved accuracy and fidelity across tasks (alignment).
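The abstract compares rule sets by accuracy, fidelity, and size without defining them. The following is a minimal sketch of how these quantities are commonly computed for a rule-based surrogate, assuming a binary task and taking fidelity to mean agreement with the DNN's own predictions. The rule representation and every name in it (`Term`, `predict`, `evaluate`, the features `h1` and `h2`, the toy data) are hypothetical illustrations, not taken from the CGX code base, and the sketch does not implement the column generation procedure itself.

```python
from typing import Dict, List, Tuple

# Assumed representation (not from the paper): a term tests one feature
# against a threshold, e.g. ("h1", ">", 0.5); a rule is a conjunction of
# terms; a rule set predicts class 1 if any rule fires, otherwise class 0.
Term = Tuple[str, str, float]
Rule = List[Term]


def term_holds(term: Term, sample: Dict[str, float]) -> bool:
    name, op, threshold = term
    if op == ">":
        return sample[name] > threshold
    if op == "<=":
        return sample[name] <= threshold
    raise ValueError(f"unsupported operator: {op}")


def predict(rules: List[Rule], sample: Dict[str, float]) -> int:
    # Disjunction over rules, conjunction over the terms of each rule.
    return int(any(all(term_holds(t, sample) for t in rule) for rule in rules))


def agreement(a: List[int], b: List[int]) -> float:
    # Fraction of positions where the two prediction lists coincide.
    return sum(x == y for x, y in zip(a, b)) / len(a)


def evaluate(rules, samples, dnn_preds, labels) -> Dict[str, float]:
    surrogate = [predict(rules, s) for s in samples]
    return {
        "fidelity": agreement(surrogate, dnn_preds),  # surrogate vs. DNN
        "accuracy": agreement(surrogate, labels),     # surrogate vs. ground truth
        "n_rules": len(rules),                        # complexity: rule count
        "n_terms": sum(len(r) for r in rules),        # complexity: total terms
    }


# Toy data: four samples, the DNN's predictions, and the true labels.
samples = [
    {"h1": 0.9, "h2": 0.1},
    {"h1": 0.2, "h2": 0.8},
    {"h1": 0.7, "h2": 0.9},
    {"h1": 0.1, "h2": 0.2},
]
dnn_preds = [1, 0, 1, 0]
labels = [1, 1, 1, 0]

rules = [[("h1", ">", 0.5)]]  # a single one-term rule
metrics = evaluate(rules, samples, dnn_preds, labels)
print(metrics)
```

On this toy data the single rule reproduces every DNN prediction (fidelity 1.0) but inherits the DNN's one mistake against the labels (accuracy 0.75), which illustrates why the two scores are reported separately.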
