CV, IV | Dec 21, 2019

Convolutional Neural Networks: A Binocular Vision Perspective

arXiv:1912.10201v1
Originality: Incremental advance
AI Analysis

The paper addresses a fundamental question in improving deep learning for visual imagery analysis by mimicking biological structures, though the advance is incremental because it builds on existing CNN frameworks.

This position paper asks whether monocular image datasets are sufficient for training CNNs and proposes a binocular CNN architecture inspired by the human visual system; its experimental results show better average classification rates than classical CNNs.

It is debatable whether image datasets captured with a single camera (monocular) are sufficient to train and test convolutional neural networks (CNNs) that are meant to imitate the biological neural network structures of the human brain. Because the human visual system is binocular, the collaboration of the two eyes with the two brain lobes needs more investigation if such CNN-based visual imagery analysis applications are to improve. It is also an open question whether the respective visual fields of each eye, and the associated brain lobes, are responsible for different learning abilities on the same scene. Such open questions need rigorous investigation in order to better understand the nature of the human visual system and hence improve currently available deep learning applications. This position paper analyses a binocular CNN architecture that is more analogous to the biological structure of the human visual system than conventional deep learning techniques. Taking a structure called the optic chiasma into account, the architecture essentially consists of two parallel CNN structures, each associated with a visual field and a brain lobe, which are fully connected at a later stage, possibly as in the primary visual cortex (V1). Experimental results demonstrate that binocular learning of two different visual fields leads to better classification rates on average than classical CNN architectures.
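The two-branch layout described in the abstract can be illustrated with a minimal, dependency-free forward pass. This is only a sketch under stated assumptions, not the paper's implementation: the abstract gives no layer sizes, so the single convolution per branch, the global average pooling, the side-by-side stacking of half-images, and all names (`split_fields`, `branch_features`, `binocular_forward`, `init_params`) are hypothetical choices made for illustration. The chiasm-like routing assumed here sends the left halves of both eye images to one branch and the right halves to the other.

```python
import random


def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def split_fields(image):
    """Split an image into its left and right halves (assumed visual fields)."""
    mid = len(image[0]) // 2
    return [row[:mid] for row in image], [row[mid:] for row in image]


def hstack(a, b):
    """Place two equally tall 2D lists side by side."""
    return [ra + rb for ra, rb in zip(a, b)]


def branch_features(field, kernels):
    """One branch: convolution -> ReLU -> global average pooling per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(field, kernel)
        relu = [max(0.0, v) for row in fmap for v in row]
        feats.append(sum(relu) / len(relu))
    return feats


def binocular_forward(left_eye, right_eye, params):
    """Route half-fields to two parallel branches, then fuse with one dense layer."""
    ll, lr = split_fields(left_eye)
    rl, rr = split_fields(right_eye)
    # Assumed chiasm-like crossing: each branch sees the same visual field
    # from both eyes (hypothetical routing, not taken from the paper).
    left_field = hstack(ll, rl)
    right_field = hstack(lr, rr)
    feats = (branch_features(left_field, params["branch_a_kernels"])
             + branch_features(right_field, params["branch_b_kernels"]))
    # Fully connected fusion of both branches into class scores.
    return [sum(w * f for w, f in zip(row, feats)) + b
            for row, b in zip(params["fc_weights"], params["fc_bias"])]


def init_params(num_kernels=2, ksize=3, num_classes=3, seed=0):
    """Random, untrained parameters; sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def kernel():
        return [[rng.uniform(-1, 1) for _ in range(ksize)] for _ in range(ksize)]

    return {
        "branch_a_kernels": [kernel() for _ in range(num_kernels)],
        "branch_b_kernels": [kernel() for _ in range(num_kernels)],
        "fc_weights": [[rng.uniform(-1, 1) for _ in range(2 * num_kernels)]
                       for _ in range(num_classes)],
        "fc_bias": [0.0] * num_classes,
    }
```

Calling `binocular_forward(left_eye, right_eye, init_params())` on two equally sized 2D lists returns one untrained score per class. The design point the sketch tries to capture is that the branches keep separate weights and only meet at the dense layer, which mirrors the "fully connected later" step in the abstract.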

