LG, CV · Aug 12, 2024

Finding Patterns in Ambiguity: Interpretable Stress Testing in the Decision Boundary

arXiv:2408.06302v1 · 3 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses interpretability for developers and users of deep learning systems, though it appears incremental as it builds on existing decision boundary analysis methods.

The paper tackles the problem of understanding decision-making in deep binary classifiers by selecting representative samples from decision boundaries and applying post-model explanation algorithms, resulting in distinct clusters and diverse prototypes that capture features leading to low-confidence decisions.

The increasing use of deep learning across various domains highlights the importance of understanding the decision-making processes of these black-box models. Recent research focusing on the decision boundaries of deep classifiers relies on generated synthetic instances in areas of low confidence, uncovering samples that challenge both models and humans. We propose a novel approach to enhance the interpretability of deep binary classifiers by selecting representative samples from the decision boundary (prototypes) and applying post-model explanation algorithms. We evaluate the effectiveness of our approach through 2D visualizations and GradientSHAP analysis. Our experiments demonstrate the potential of the proposed method, revealing distinct and compact clusters and diverse prototypes that capture essential features that lead to low-confidence decisions. By offering a more aggregated view of deep classifiers' decision boundaries, our work contributes to the responsible development and deployment of reliable machine learning systems.
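
The prototype-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how boundary samples are chosen or clustered, so everything here is an assumption made for illustration: boundary samples are taken to be those whose positive-class probability lies within a hypothetical `margin` of 0.5, clustering is plain k-means, and a prototype is the real sample nearest each centroid. The function names are hypothetical, and the explanation step (GradientSHAP in the paper) is left out.

```python
import math
import random


def boundary_indices(probs, margin=0.1):
    """Indices of samples whose positive-class probability is within `margin` of 0.5.

    Assumption: "low confidence" is approximated by closeness to 0.5.
    """
    return [i for i, p in enumerate(probs) if abs(p - 0.5) <= margin]


def _nearest(point, centers):
    """Index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def select_prototypes(points, k, n_iter=50, seed=0):
    """Run plain k-means, then pick the real sample nearest each centroid.

    Assumption: k-means stands in for whatever clustering the paper uses.
    Returns (prototype_indices, labels).
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        # Assignment step: attach every point to its closest center.
        labels = [_nearest(p, centers) for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    labels = [_nearest(p, centers) for p in points]
    # A prototype is an actual sample, not the (possibly synthetic) centroid.
    prototypes = [
        min(range(len(points)), key=lambda i: math.dist(points[i], centers[j]))
        for j in range(k)
    ]
    return prototypes, labels
```

In use, `boundary_indices` would filter a classifier's predicted probabilities, and `select_prototypes` would then run on the feature vectors (or a 2D embedding) of the retained samples. Returning real samples instead of centroids keeps each prototype something an attribution method can be applied to directly.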

Code Implementations: 1 repo
