A Neurodynamic model of Saliency prediction in V1
This work addresses the need for a unified computational model of V1 for neuroscience and computer vision researchers, though it is incremental as it builds on existing neurodynamic models.
The authors tackled the problem of modeling multiple visual processes in V1 simultaneously, showing that their biologically plausible neurodynamic model (NSWAM) can predict saliency and other visual processes like brightness induction with accuracy similar to state-of-the-art methods, particularly on synthetic images.
Lateral connections in the primary visual cortex (V1) have long been hypothesized to be responsible for several visual processing mechanisms such as brightness induction, chromatic induction, visual discomfort and bottom-up visual attention (also named saliency). Many computational models have been developed to independently predict these and other visual processes, but no computational model has been able to reproduce all of them simultaneously. In this work we show that a biologically plausible computational model of lateral interactions of V1 is able to simultaneously predict saliency and all the aforementioned visual processes. The architecture of our model (NSWAM) is based on Penacchio's neurodynamic model of lateral connections of V1. It is defined as a network of firing-rate neurons, sensitive to visual features such as brightness, color, orientation and scale. We tested NSWAM saliency predictions using images from several eye-tracking datasets. We show that the accuracy of the predictions obtained by our architecture, measured with shuffled metrics, is similar to that of other state-of-the-art computational methods, particularly with synthetic images (CAT2000-Pattern & SID4VAM), which mainly contain low-level features. Moreover, we outperform other biologically-inspired saliency models that are specifically designed to reproduce saliency exclusively. Hence, we show that our biologically plausible model of lateral connections can simultaneously explain different visual processes present in V1, without applying any type of training or optimization and keeping the same parametrization for all the visual processes. This can be useful for the definition of a unified architecture of the primary visual cortex.
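To illustrate what a shuffled metric measures, the following is a minimal sketch of a shuffled AUC score, one common metric of this family. The text above does not state which shuffled metrics were used or how they were computed, so the function name `shuffled_auc`, its arguments, and the tie handling are assumptions made for illustration only. Fixations on the evaluated image act as positives, and fixations taken from other images act as negatives, which penalizes maps that merely reproduce a dataset-wide center bias.

```python
import numpy as np

def shuffled_auc(saliency, fixations, other_fixations):
    """Hypothetical helper: AUC with negatives drawn from other images' fixations.

    saliency        : 2-D array, the predicted saliency map
    fixations       : list of (row, col) fixations recorded on this image
    other_fixations : list of (row, col) fixations recorded on other images
    """
    # Saliency values at true fixations (positives) and shuffled fixations (negatives)
    pos = np.array([saliency[r, c] for r, c in fixations], dtype=float)
    neg = np.array([saliency[r, c] for r, c in other_fixations], dtype=float)

    # Fraction of (positive, negative) pairs ranked correctly; ties count as half
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Toy example: a map with a single peak at the true fixation
sal = np.zeros((4, 4))
sal[1, 1] = 1.0
score = shuffled_auc(sal, [(1, 1)], [(0, 0), (3, 3)])
```

Under this definition, a score of 0.5 corresponds to chance: a uniform map, or a map that ranks shuffled fixations as highly as true ones, gains nothing.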