Is Disentanglement all you need? Comparing Concept-based & Disentanglement Approaches
This work addresses the need for a comparative analysis in interpretable AI, but its contribution is incremental: it reviews and contrasts two existing fields without introducing new methods.
The paper systematically compares concept-based explanations and disentanglement approaches for extracting human-interpretable representations from deep models, highlighting that state-of-the-art methods from both classes can be data inefficient, task-sensitive, or representation-sensitive.
Concept-based explanations have emerged as a popular way of extracting human-interpretable representations from deep discriminative models. At the same time, the disentanglement learning literature has focused on extracting similar representations in an unsupervised or weakly-supervised way, using deep generative models. Despite the overlapping goals and potential synergies, to our knowledge, there has not yet been a systematic comparison of the limitations and trade-offs between concept-based explanations and disentanglement approaches. In this paper, we give an overview of these fields, comparing and contrasting their properties and behaviours on a diverse set of tasks, and highlighting their potential strengths and limitations. In particular, we demonstrate that state-of-the-art approaches from both classes can be data inefficient, sensitive to the specific nature of the classification/regression task, or sensitive to the employed concept representation.