NCCV, Feb 10, 2025

Deciphering Functions of Neurons in Vision-Language Models

arXiv:2502.18485v4 | 2 citations | h-index: 5 | Has Code | MM
Originality: Incremental advance
AI Analysis

This research addresses interpretability in vision-language models, which is crucial for building trustworthy AI systems and the applications that depend on them.

This study investigated the functions of individual neurons in vision-language models and found that they can be categorized into visual, text, and multi-modal neurons. The study developed a framework that automates neuron explanations and, for visual neurons, assessed the reliability of those explanations with an activation simulator.

The rapid growth of open-source vision-language models (VLMs) has enabled a wide range of applications across diverse domains. Ensuring the transparency and interpretability of these models is critical for fostering trustworthy and responsible AI systems. In this study, our objective is to examine the internals of VLMs and interpret the functions of individual neurons. We observe the activations of neurons with respect to the input visual tokens and text tokens and report several findings. In particular, we find neurons that respond only to visual information, only to text information, or to both, which we refer to as visual neurons, text neurons, and multi-modal neurons, respectively. We build a framework that automates the explanation of neurons with the assistance of GPT-4o. For visual neurons, we also propose an activation simulator to assess the reliability of their explanations. Systematic statistical analyses on one representative VLM, LLaVA, uncover the behaviors and characteristics of the different categories of neurons.
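
The three neuron categories described above can be illustrated with a minimal sketch. The abstract does not give the paper's actual criterion, so the rule below is an assumption made only for illustration: a neuron is labeled by the share of visual tokens among its top-k most activated input tokens. The function name `categorize_neuron` and the parameters `top_k` and `threshold` are hypothetical and do not come from the paper or its code.

```python
from typing import Sequence


def categorize_neuron(
    activations: Sequence[float],
    is_visual_token: Sequence[bool],
    top_k: int = 4,
    threshold: float = 0.75,
) -> str:
    """Label one neuron as "visual", "text", or "multi-modal".

    Assumed rule (not taken from the paper): look at the top_k tokens
    with the highest activation and measure what share of them are
    visual tokens.
    """
    if len(activations) != len(is_visual_token):
        raise ValueError("activations and is_visual_token must have equal length")
    if len(activations) == 0:
        raise ValueError("at least one token is required")

    k = min(top_k, len(activations))
    # Indices of the k tokens on which the neuron fires most strongly.
    top = sorted(range(len(activations)),
                 key=lambda i: activations[i],
                 reverse=True)[:k]
    visual_share = sum(1 for i in top if is_visual_token[i]) / k

    if visual_share >= threshold:
        return "visual"
    if visual_share <= 1.0 - threshold:
        return "text"
    return "multi-modal"


# Toy input: four visual tokens followed by four text tokens.
mask = [True, True, True, True, False, False, False, False]
label_a = categorize_neuron([5, 4, 3, 2, 0, 0, 0, 0], mask)  # fires on visual tokens
label_b = categorize_neuron([0, 0, 0, 0, 5, 4, 3, 2], mask)  # fires on text tokens
label_c = categorize_neuron([5, 4, 0, 0, 3, 2, 0, 0], mask)  # fires on both
```

In a real analysis the activations would come from hooks on a VLM such as LLaVA and be aggregated over many samples; the threshold and top-k values here are arbitrary choices for the toy example.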

