CV NC | Jun 12, 2024

Self-Attention-Based Contextual Modulation Improves Neural System Identification

Isaac Lin, Tianye Wang, Shang Gao, Shiming Tang, Tai Sing Lee

arXiv:2406.07843v32.0

Originality: Incremental advance

AI Analysis

This work addresses the challenge of accurately capturing surround-center interactions in neural system identification. It is an incremental improvement over existing methods.

The paper tackles the problem of modeling contextual modulation in visual cortical neurons by introducing self-attention mechanisms into CNNs. This improves neural response predictions on two metrics: tuning curve correlation and peak tuning.

Convolutional neural networks (CNNs) have been shown to be state-of-the-art models for visual cortical neurons. Cortical neurons in the primary visual cortex are sensitive to contextual information mediated by extensive horizontal and feedback connections. Standard CNNs integrate global contextual information to model contextual modulation via two mechanisms: successive convolutions and a fully connected readout layer. In this paper, we find that self-attention (SA), an implementation of non-local network mechanisms, can improve neural response predictions over parameter-matched CNNs in two key metrics: tuning curve correlation and peak tuning. We introduce peak tuning as a metric to evaluate a model's ability to capture a neuron's top feature preference. We factorize networks to assess each context mechanism, revealing that information in the local receptive field is most important for modeling overall tuning, but surround information is critically necessary for characterizing the tuning peak. We find that self-attention can replace posterior spatial-integration convolutions when learned incrementally, and is further enhanced in the presence of a fully connected readout layer, suggesting that the two context mechanisms are complementary. Finally, we find that decomposing receptive field learning and contextual modulation learning in an incremental manner may be an effective and robust mechanism for learning surround-center interactions.
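As a rough illustration of the mechanism the abstract describes, the following is a minimal NumPy sketch of single-head self-attention applied across the spatial positions of a CNN feature map, plus a Pearson-correlation helper of the kind a tuning curve correlation could be computed with. It is not the authors' architecture: the function names, the residual connection, the projection shapes, and the use of plain Pearson correlation are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_self_attention(feat, w_q, w_k, w_v):
    """Single-head self-attention over spatial positions (hypothetical helper).

    feat: (H, W, C) feature map from earlier convolutional layers.
    w_q, w_k: (C, D) query/key projections; w_v: (C, C) value projection.
    Returns the (H, W, C) output with a residual connection (an assumption)
    and the (H*W, H*W) attention matrix.
    """
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c)          # one token per spatial location
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product similarity
    attn = softmax(scores, axis=-1)          # each location attends to all others
    out = tokens + attn @ v                  # mix in context from the surround
    return out.reshape(h, w, c), attn


def tuning_curve_correlation(pred, actual):
    """Pearson correlation between predicted and recorded responses
    across stimuli (assumed form of the metric)."""
    return float(np.corrcoef(pred, actual)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.normal(size=(4, 4, 8))
    w_q = rng.normal(size=(8, 4))
    w_k = rng.normal(size=(8, 4))
    w_v = rng.normal(size=(8, 8))
    out, attn = spatial_self_attention(feat, w_q, w_k, w_v)
    print(out.shape, attn.shape)
```

Because every spatial location attends to every other location in one step, a layer like this can bring surround information into the center response without stacking further convolutions, which is the role the abstract attributes to self-attention.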
