Automated Natural Language Explanation of Deep Visual Neurons with Large Models
This work addresses the need for scalable neuron interpretation in deep learning by offering an automated solution that reduces reliance on human effort, though the contribution is incremental because it builds on existing post-hoc methods.
The paper tackles the problem of interpreting deep visual neurons by proposing an automated framework that uses large foundation models to generate semantic explanations without human intervention. The framework is designed to scale and to remain compatible with various model architectures and datasets.
Deep neural networks have exhibited remarkable performance across a wide range of real-world tasks. However, understanding why they are so effective remains a challenging problem. Examining individual neurons offers distinct advantages for exploring the inner workings of these networks. Previous research has indicated that specific neurons within deep vision networks carry semantic meaning and play pivotal roles in model performance. Nonetheless, current methods for generating neuron semantics rely heavily on human intervention, which hampers their scalability and applicability. To address this limitation, this paper proposes a novel post-hoc framework that generates semantic explanations of neurons with large foundation models, without requiring human intervention or prior knowledge. Our framework is designed to be compatible with various model architectures and datasets, facilitating automated and scalable neuron interpretation. We evaluate the proposed approach through both qualitative and quantitative analysis to verify its effectiveness.