CV LG NC
Oct 1, 2020

Neural encoding with visual attention

Meenakshi Khosla, Gia H. Ngo, Keith Jamison, Amy Kuceyeski, Mert R. Sabuncu

arXiv:2010.00516v13.36 citations

Originality: Incremental advance

AI Analysis

This work models neural encoding in visual perception and offers a way to infer attention without eye-tracking. The advance is incremental, since it combines existing techniques.

The study predicts brain responses to visual stimuli by incorporating attention mechanisms. It shows that gaze data improves prediction accuracy, and that a trainable attention module can learn attention policies from fMRI data alone. The attention locations the module learns align with actual eye fixation patterns.

Visual perception is critically influenced by the focus of attention. Due to limited resources, it is well known that neural representations are biased in favor of attended locations. Using concurrent eye-tracking and functional Magnetic Resonance Imaging (fMRI) recordings from a large cohort of human subjects watching movies, we first demonstrate that leveraging gaze information, in the form of attentional masking, can significantly improve brain response prediction accuracy in a neural encoding model. Next, we propose a novel approach to neural encoding by including a trainable soft-attention module. Using our new approach, we demonstrate that it is possible to learn visual attention policies by end-to-end learning merely on fMRI response data, and without relying on any eye-tracking. Interestingly, we find that attention locations estimated by the model on independent data agree well with the corresponding eye fixation patterns, despite no explicit supervision to do so. Together, these findings suggest that attention modules can be instrumental in neural encoding models of visual stimuli.
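The two ideas in the abstract, gaze-based attentional masking and a trainable soft-attention module, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian form of the gaze mask, the single linear scoring vector `attn_weights`, the linear `readout`, and all function names and shapes are assumptions made purely for illustration, and no training loop is shown.

```python
import numpy as np

def gaussian_gaze_mask(height, width, gaze_row, gaze_col, sigma):
    """Spatial mask peaked at a recorded gaze location (assumed Gaussian), summing to 1."""
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    sq_dist = (rows - gaze_row) ** 2 + (cols - gaze_col) ** 2
    mask = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return mask / mask.sum()

def soft_attention_map(features, attn_weights):
    """Attention map computed from the stimulus features alone (no eye-tracking).

    features: (C, H, W) feature map; attn_weights: (C,) hypothetical learnable scores.
    """
    logits = np.tensordot(attn_weights, features, axes=([0], [0]))  # (H, W)
    logits = logits - logits.max()  # subtract max for a numerically stable softmax
    attn = np.exp(logits)
    return attn / attn.sum()  # softmax over all spatial locations

def attended_encoding(features, attn_map, readout):
    """Pool features under an attention map, then map linearly to voxel responses.

    attn_map: (H, W) weights summing to 1; readout: (n_voxels, C) hypothetical linear readout.
    """
    pooled = (features * attn_map[None, :, :]).sum(axis=(1, 2))  # (C,)
    return readout @ pooled  # (n_voxels,)
```

In this sketch the same `attended_encoding` function accepts either map: passing the output of `gaussian_gaze_mask` corresponds to masking with measured gaze, while passing the output of `soft_attention_map` corresponds to attention inferred by the model. In an end-to-end setting, `attn_weights` and `readout` would be fitted jointly against fMRI responses with an automatic-differentiation framework.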
