CV · Jun 12, 2024

LLM-assisted Concept Discovery: Automatically Identifying and Explaining Neuron Functions

arXiv:2406.08572v1 · 7 citations
Originality: Incremental advance
AI Analysis

This work gives researchers and practitioners an automated tool for understanding DNN behavior, though the advance is incremental because it builds on prior concept-based explanation methods.

The paper tackles the problem of automatically identifying and explaining neuron functions in deep neural networks by leveraging multimodal large language models for open-ended concept discovery, resulting in more faithful and interpretable concepts without requiring pre-defined sets or manual input.

Providing textual concept-based explanations for neurons in deep neural networks (DNNs) is important for understanding how a DNN model works. Prior works have associated concepts with neurons based on examples of concepts or a pre-defined set of concepts, which limits possible explanations to what the user expects and makes it hard to discover new concepts. Furthermore, defining the set of concepts requires manual work from the user, either by directly specifying the concepts or by collecting examples. To overcome these limitations, we propose to leverage multimodal large language models for automatic and open-ended concept discovery. We show that, without a restricted set of pre-defined concepts, our method gives rise to novel interpretable concepts that are more faithful to the model's behavior. To quantify this, we validate each concept by generating examples and counterexamples and evaluating the neuron's response on this new set of images. Collectively, our method can discover concepts and simultaneously validate them, providing a credible automated tool to explain deep neural networks.
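
The validation step described in the abstract (comparing a neuron's response on generated examples and counterexamples of a concept) can be illustrated with a small sketch. The abstract does not state which metric the authors use, so the pairwise ranking score below (equivalent to AUROC) is an assumption chosen for illustration, and the function name `concept_validation_score` is hypothetical. The activations are taken to be already computed, one scalar per image.

```python
from typing import Sequence


def concept_validation_score(
    example_acts: Sequence[float],
    counter_acts: Sequence[float],
) -> float:
    """Fraction of (example, counterexample) pairs in which the example
    activates the neuron more strongly. Ties count as half a win.

    This is an illustrative metric (equivalent to AUROC), not necessarily
    the one used in the paper. A score near 1.0 suggests the concept
    separates the two image sets; a score near 0.5 suggests it does not.
    """
    if len(example_acts) == 0 or len(counter_acts) == 0:
        raise ValueError("both activation lists must be non-empty")

    wins = 0.0
    for a in example_acts:
        for b in counter_acts:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(example_acts) * len(counter_acts))


# Hypothetical activations of one neuron on generated images.
score = concept_validation_score([0.9, 0.8, 0.7], [0.1, 0.75, 0.2])
print(round(score, 3))
```

In this toy case, eight of the nine pairs rank the example above the counterexample, so the candidate concept would look reasonably faithful to the neuron under this assumed metric.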

