NC CV NE
Jul 18, 2014

Pixels to Voxels: Modeling Visual Representation in the Human Brain

Pulkit Agrawal, Dustin Stansbury, Jitendra Malik, Jack L. Gallant

arXiv:1407.5104v1
111 citations

Originality: Highly original

AI Analysis

This work provides a new platform for exploring the functional principles of human vision. It addresses a fundamental limitation of existing models of high-level vision by operating directly on pixels rather than on hand-annotated images.

The authors address the problem of predicting human brain activity directly from low-level visual input (pixels), without semantic tags. They find that models based on both Fisher Vectors and Convolutional Neural Networks accurately predict activity in high-level visual areas, which they report as the first such mapping.

The human brain is adept at solving difficult high-level visual processing problems such as image interpretation and object recognition in natural scenes. Over the past few years neuroscientists have made remarkable progress in understanding how the human brain represents categories of objects and actions in natural scenes. However, all current models of high-level human vision operate on hand-annotated images in which the objects and actions have been assigned semantic tags by a human operator. No current models can account for high-level visual function directly in terms of low-level visual input (i.e., pixels). To overcome this fundamental limitation we sought to develop a new class of models that can predict human brain activity directly from low-level visual input. We explored two classes of models drawn from computer vision and machine learning. The first class of models was based on Fisher Vectors (FV) and the second was based on Convolutional Neural Networks (ConvNets). We find that both classes of models accurately predict brain activity in high-level visual areas, directly from pixels and without the need for any semantic tags or hand annotation of images. This is the first time that such a mapping has been obtained. The fit models provide a new platform for exploring the functional principles of human vision, and they show that modern methods of computer vision and machine learning provide important tools for characterizing brain function.
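As a rough illustration of what "predicting brain activity from pixels" can look like in practice, the sketch below fits a linear map from image features to per-voxel responses and scores it by correlation on held-out data. Everything here is an assumption made for illustration: the abstract does not state the regression method, so ridge regression is swapped in as a common choice, the random arrays stand in for real FV or ConvNet features and fMRI responses, and the names `fit_ridge` and `voxelwise_correlation` are hypothetical.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y.

    X: (n_images, n_features) feature matrix (assumed: FV or ConvNet features).
    Y: (n_images, n_voxels) measured responses, one column per voxel.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-in data (not real features or brain recordings).
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 20, 5
W_true = rng.normal(size=(n_features, n_voxels))
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, n_voxels))
Y_test = X_test @ W_true + 0.1 * rng.normal(size=(n_test, n_voxels))

# Fit on training images, then evaluate prediction accuracy on held-out images.
W = fit_ridge(X_train, Y_train, alpha=1.0)
scores = voxelwise_correlation(Y_test, X_test @ W)
```

Scoring on held-out images rather than on the training set is what makes the per-voxel correlation a measure of prediction rather than of fit. With real data, the regularization strength `alpha` would normally be chosen by cross-validation.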
