LG · Aug 23, 2024

NeurCAM: Interpretable Neural Clustering via Additive Models

arXiv:2408.13361v1 · 3 citations · h-index: 2
Originality: Highly original
AI Analysis

This work addresses the need for interpretable clustering in knowledge discovery and pattern recognition, particularly on complex problems where tree-based methods require large trees and become harder to interpret.

The authors tackle interpretable clustering by introducing NeurCAM, which uses neural generalized additive models to provide fuzzy cluster membership with additive explanations. It achieves performance comparable to black-box methods on tabular data and significantly outperforms other interpretable approaches on text data.

Interpretable clustering algorithms aim to group similar data points while explaining the obtained groups to support knowledge discovery and pattern recognition tasks. While most approaches to interpretable clustering construct clusters using decision trees, the interpretability of trees often deteriorates on complex problems where large trees are required. In this work, we introduce the Neural Clustering Additive Model (NeurCAM), a novel approach to the interpretable clustering problem that leverages neural generalized additive models to provide fuzzy cluster membership with additive explanations of the obtained clusters. To promote sparsity in our model's explanations, we introduce selection gates that explicitly limit the number of features and pairwise interactions leveraged. Additionally, we demonstrate the capacity of our model to perform text clustering that considers the contextual representation of the texts while providing explanations for the obtained clusters based on uni- or bi-word terms. Extensive experiments show that NeurCAM achieves performance comparable to black-box methods on tabular datasets while remaining interpretable. Additionally, our approach significantly outperforms other interpretable clustering approaches when clustering on text data.
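As a rough illustration of the additive structure the abstract describes, the sketch below builds one small shape function per feature, sums their per-cluster outputs into logits, and applies a softmax to obtain fuzzy memberships. It is not the authors' implementation: the names `make_shape_function` and `additive_memberships` are hypothetical, the weights are random and untrained, and the selection gates, pairwise interactions, and clustering objective are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_shape_function(n_clusters, hidden=8):
    """Hypothetical shape function: a tiny one-hidden-layer MLP that maps
    a single scalar feature to one score per cluster (untrained weights)."""
    w1 = rng.normal(size=(1, hidden))
    b1 = rng.normal(size=hidden)
    w2 = rng.normal(size=(hidden, n_clusters))

    def f(x):
        # x has shape (n_samples,); output has shape (n_samples, n_clusters)
        h = np.tanh(x[:, None] @ w1 + b1)
        return h @ w2

    return f

def additive_memberships(X, shape_functions):
    """Sum per-feature contributions into cluster logits, then softmax."""
    # contributions[j] is feature j's additive effect on every cluster logit
    contributions = np.stack(
        [f(X[:, j]) for j, f in enumerate(shape_functions)], axis=0
    )  # shape: (n_features, n_samples, n_clusters)
    logits = contributions.sum(axis=0)
    # Subtract the row-wise max for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    memberships = exp / exp.sum(axis=1, keepdims=True)
    return memberships, contributions

# Toy data: 5 samples, 3 features, 2 clusters
X = rng.normal(size=(5, 3))
shape_functions = [make_shape_function(n_clusters=2) for _ in range(X.shape[1])]
memberships, contributions = additive_memberships(X, shape_functions)
```

Each row of `memberships` is a soft assignment over clusters, and `contributions[j, i, k]` shows how much feature `j` pushed sample `i` toward cluster `k`, which is the kind of additive explanation the abstract refers to.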

Code Implementations: 1 repo