Granular Concept Circuits: Toward a Fine-Grained Circuit Discovery for Concept Representations
This paper addresses the problem of interpretability in deep learning, which is relevant to both researchers and practitioners, though the contribution appears incremental because it builds on existing circuit discovery approaches.
The paper tackles the challenge of pinpointing where specific visual concepts are encoded in deep vision models. It introduces Granular Concept Circuit (GCC), a method that discovers multiple circuits, each representing a concept relevant to a given query, and validates it across various deep image classification models.
Deep vision models have achieved remarkable classification performance by leveraging a hierarchical architecture in which human-interpretable concepts emerge through the composition of individual neurons across layers. Given the distributed nature of representations, pinpointing where specific visual concepts are encoded within a model remains a crucial yet challenging task. In this paper, we introduce an effective circuit discovery method, called Granular Concept Circuit (GCC), in which each circuit represents a concept relevant to a given query. To construct each circuit, our method iteratively assesses inter-neuron connectivity, focusing on both functional dependencies and semantic alignment. By automatically discovering multiple circuits, each capturing specific concepts within that query, our approach offers a profound, concept-wise interpretation of models and is the first to identify circuits tied to specific visual concepts at a fine-grained level. We validate the versatility and effectiveness of GCCs across various deep image classification models.
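The iterative construction described above can be illustrated with a minimal sketch. The abstract does not define how functional dependency or semantic alignment is scored, so everything below is an assumption made for illustration: the toy ReLU network, the ablation-based dependency score, the top-k overlap used as an alignment score, the thresholds, and all function names are hypothetical stand-ins rather than the authors' method.

```python
from typing import List, Optional, Sequence, Set, Tuple

Neuron = Tuple[int, int]  # (layer index, unit index); layer 0 is the input


def forward(weights, x: Sequence[float],
            ablate: Optional[Neuron] = None) -> List[List[float]]:
    """Run a tiny bias-free ReLU MLP and return the activations of every layer.

    If `ablate` is given, that neuron's activation is zeroed before it feeds
    the next layer.
    """
    acts = [list(x)]
    for layer_idx, W in enumerate(weights, start=1):
        prev = acts[-1]
        cur = [max(0.0, sum(w * a for w, a in zip(row, prev))) for row in W]
        if ablate is not None and ablate[0] == layer_idx:
            cur[ablate[1]] = 0.0
        acts.append(cur)
    return acts


def top_k(values: Sequence[float], k: int) -> Set[int]:
    """Indices of the k probe inputs with the highest activation."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[:k])


def discover_circuit(weights, query, probes, root: Neuron,
                     dep_thresh: float = 0.3, sem_thresh: float = 0.5,
                     k: int = 2) -> List[Tuple[Neuron, Neuron]]:
    """Grow a circuit from `root` toward deeper layers, one edge at a time.

    An edge (src, tgt) is kept only if both hypothetical scores pass:
      dep: relative drop in tgt's activation on the query when src is ablated
      sem: Jaccard overlap of the top-k probe inputs of src and tgt
    """
    base = forward(weights, query)
    probe_acts = [forward(weights, p) for p in probes]

    def top(n: Neuron) -> Set[int]:
        return top_k([pa[n[0]][n[1]] for pa in probe_acts], k)

    edges, frontier, visited = [], [root], {root}
    while frontier:
        src = frontier.pop(0)
        layer = src[0]
        if layer >= len(weights):  # src sits in the last layer
            continue
        ablated = forward(weights, query, ablate=src)
        for unit in range(len(weights[layer])):  # units of layer + 1
            tgt = (layer + 1, unit)
            a = base[layer + 1][unit]
            if a <= 0.0:  # target inactive on this query
                continue
            dep = abs(a - ablated[layer + 1][unit]) / a
            s_top, t_top = top(src), top(tgt)
            sem = len(s_top & t_top) / len(s_top | t_top)
            if dep >= dep_thresh and sem >= sem_thresh:
                edges.append((src, tgt))
                if tgt not in visited:
                    visited.add(tgt)
                    frontier.append(tgt)
    return edges


# Toy network: 2 inputs -> 2 hidden units -> 3 output units.
weights = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.1, 1.0], [1.0, 1.0]],
]
probes = [(1.0, 0.0), (0.8, 0.0), (0.0, 3.0), (0.1, 2.0)]
query = (1.0, 1.0)

circuit_a = discover_circuit(weights, query, probes, root=(1, 0))
circuit_b = discover_circuit(weights, query, probes, root=(1, 1))
print(circuit_a)
print(circuit_b)
```

In this toy setup the two roots yield different circuits for the same query, which mirrors the idea of several concept-specific circuits per query. Requiring both scores to pass is one possible design choice: a target that depends on the source but responds to different probe inputs is excluded, as is a target that responds to similar inputs but barely changes under ablation.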