LG · AI · ML · Jun 2, 2025

Towards Better Generalization and Interpretability in Unsupervised Concept-Based Models

arXiv:2506.02092v1 · 1 citation · h-index: 10 · ECML/PKDD
Originality: Incremental advance
AI Analysis

This work addresses the need for more interpretable and generalizable AI models in image classification, though it appears incremental as it builds on existing concept-based approaches.

The paper tackles the problem of improving trustworthiness in deep neural networks by introducing an unsupervised concept-based model for image classification, which achieves better generalization than existing unsupervised models and nearly matches black-box performance while enhancing interpretability.

To increase the trustworthiness of deep neural networks, it is critical to improve the understanding of how they make decisions. This paper introduces a novel unsupervised concept-based model for image classification, named Learnable Concept-Based Model (LCBM), which models concepts as random variables within a Bernoulli latent space. Unlike traditional methods that either require extensive human supervision or suffer from limited scalability, our approach employs a reduced number of concepts without sacrificing performance. We demonstrate that LCBM surpasses existing unsupervised concept-based models in generalization capability and nearly matches the performance of black-box models. The proposed concept representation enhances information retention and aligns more closely with human understanding. A user study demonstrates that the discovered concepts are also more intuitive for humans to interpret. Finally, despite the use of concept embeddings, we maintain model interpretability by means of a local linear combination of concepts.
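
The two mechanisms named in the abstract, Bernoulli-distributed concepts and a local linear combination of concepts, can be illustrated with a toy sketch. This is not the paper's implementation: the function names, the concept logits, and the per-input weights below are all hypothetical, and the encoder that would produce them from an image is omitted.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_concepts(concept_logits, rng):
    """Draw one binary activation per concept from Bernoulli(sigmoid(logit))."""
    probs = [sigmoid(z) for z in concept_logits]
    concepts = [1 if rng.random() < p else 0 for p in probs]
    return concepts, probs


def local_linear_score(concepts, local_weights, bias=0.0):
    """Score one class as a weighted sum of the active concepts.

    The per-concept contributions are returned alongside the score,
    since they serve as the explanation for this single input.
    """
    contributions = [w * c for w, c in zip(local_weights, concepts)]
    return bias + sum(contributions), contributions


rng = random.Random(0)
concept_logits = [2.0, -1.0, 0.5]   # hypothetical encoder outputs for one image
local_weights = [1.5, -0.7, 0.3]    # hypothetical weights predicted for this image
concepts, probs = sample_concepts(concept_logits, rng)
score, contributions = local_linear_score(concepts, local_weights)
```

In this sketch "local" means the weights are specific to one input rather than shared across the dataset, so the score stays an exact sum of per-concept terms that can be read off directly. An inactive concept contributes nothing.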

