CV NC Dec 20, 2013

Efficient Visual Coding: From Retina To V2

arXiv:1312.6077v2 6 citations

Originality: Incremental advance

AI Analysis

This work addresses the integration of psychophysiological and neurophysiological models for human vision, offering insights for computer vision, but it is incremental as it builds on an existing hierarchical model.

The authors tackled the problem of modeling the hierarchical visual system from retina to V2 by improving a recursive ICA model with sparse PCA, resulting in biologically-plausible receptive fields for multiple visual layers when applied to natural image patches. Their model provides predictions for neuroscience, such as suggesting that a prior study improperly discarded neurons capturing shape contours.

The human visual system has a hierarchical structure consisting of layers of processing, such as the retina, V1, V2, etc. Understanding the functional roles of these visual processing layers would help to integrate the psychophysiological and neurophysiological models into a consistent theory of human vision, and would also provide insights for computer vision research. One classical theory of the early visual pathway hypothesizes that it serves to capture the statistical structure of the visual inputs by efficiently coding the visual information in its outputs. Until recently, most computational models following this theory have focused on explaining the receptive field properties of one or two visual layers. Recent work in deep networks has eliminated this concern; however, there is still the retinal layer to consider. Here we improve on a previously described hierarchical model, Recursive ICA (RICA) [1], which starts with PCA, followed by a layer of sparse coding or ICA, followed by a component-wise nonlinearity derived from considerations of the variable distributions expected by ICA. This process is then repeated. Our improvement is to use a new version of sparse PCA (sPCA), which results in biologically plausible receptive fields for both the sPCA and the ICA/sparse coding layers. When applied to natural image patches, our model learns visual features exhibiting the receptive field properties of retinal ganglion cells/lateral geniculate nucleus (LGN) cells, V1 simple cells, V1 complex cells, and V2 cells. Our work provides predictions for experimental neuroscience studies. For example, our results suggest that a previous neurophysiological study improperly discarded some of its recorded neurons; we predict that the discarded neurons capture the shape contours of objects.
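The recursive structure described in the abstract (a PCA step, then ICA, then a component-wise nonlinearity, with the whole sequence repeated) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses ordinary PCA whitening rather than the paper's sparse PCA, a basic symmetric FastICA with a tanh contrast function, and a placeholder nonlinearity (standardized log-magnitude) in place of the one the paper derives from the distributions expected by ICA. All function names (`pca_whiten`, `fast_ica`, `placeholder_nonlinearity`, `rica_like_stack`) are hypothetical.

```python
import numpy as np

def pca_whiten(X, n_components):
    """Center X (samples x dims), keep the top principal components,
    and scale each to unit variance. Plain PCA, NOT the paper's sPCA."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order] / np.sqrt(eigvals[order] + 1e-8)
    return Xc @ W

def sym_decorrelate(W):
    """Make the rows of W orthonormal: W <- (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(W @ W.T)
    s = np.maximum(s, 1e-12)
    return (u / np.sqrt(s)) @ u.T @ W

def fast_ica(Z, n_iter=200, seed=0):
    """Symmetric FastICA (tanh contrast) on whitened data Z."""
    rng = np.random.default_rng(seed)
    n = Z.shape[1]
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        Y = Z @ W.T                      # current source estimates
        G = np.tanh(Y)                   # g(y)
        G_prime = 1.0 - G ** 2           # g'(y)
        # Fixed-point update: w <- E[z g(w.z)] - E[g'(w.z)] w, per row.
        W_new = (G.T @ Z) / Z.shape[0] - G_prime.mean(axis=0)[:, None] * W
        W = sym_decorrelate(W_new)
    return Z @ W.T

def placeholder_nonlinearity(S):
    """Assumed stand-in for the paper's nonlinearity: discard sign,
    compress the heavy tails, then standardize each component."""
    A = np.log(np.abs(S) + 1e-3)
    return (A - A.mean(axis=0)) / (A.std(axis=0) + 1e-8)

def rica_like_stack(X, dims, seed=0):
    """Apply (PCA -> ICA -> nonlinearity) once per entry in dims.
    Returns the ICA outputs of every layer."""
    outputs = []
    H = X
    for k in dims:
        S = fast_ica(pca_whiten(H, k), seed=seed)
        outputs.append(S)
        H = placeholder_nonlinearity(S)  # input to the next layer
    return outputs

if __name__ == "__main__":
    # Synthetic heavy-tailed data standing in for natural image patches.
    rng = np.random.default_rng(1)
    X = rng.laplace(size=(1000, 12)) @ rng.standard_normal((12, 12))
    layers = rica_like_stack(X, dims=[6, 3])
    print([layer.shape for layer in layers])
```

Discarding the sign before the next layer reflects the general idea that the magnitudes of ICA outputs carry residual dependencies a second layer can model; the specific form used here is only an assumption for illustration.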
