CV | AI | Feb 13, 2012

Multi-column Deep Neural Networks for Image Classification

arXiv:1202.2745v1 | 4046 citations
AI Analysis

This work improves image classification accuracy on tasks such as handwritten digit and traffic sign recognition, where traditional computer vision and machine learning methods fall short of human performance.

The paper tackles image classification by introducing a biologically plausible deep neural network architecture with multiple columns and winner-take-all neurons, achieving near-human performance on MNIST and outperforming humans by a factor of two on traffic sign recognition.

Traditional methods of computer vision and machine learning cannot match human performance on tasks such as the recognition of handwritten digits or traffic signs. Our biologically plausible deep artificial neural network architectures can. Small (often minimal) receptive fields of convolutional winner-take-all neurons yield large network depth, resulting in roughly as many sparsely connected neural layers as found in mammals between retina and visual cortex. Only winner neurons are trained. Several deep neural columns become experts on inputs preprocessed in different ways; their predictions are averaged. Graphics cards allow for fast training. On the very competitive MNIST handwriting benchmark, our method is the first to achieve near-human performance. On a traffic sign recognition benchmark it outperforms humans by a factor of two. We also improve the state-of-the-art on a plethora of common image classification benchmarks.
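The averaging step in the abstract ("several deep neural columns ... their predictions are averaged") can be illustrated with a minimal sketch. It assumes each column has already produced a per-class probability vector for the same input. The function names `average_columns` and `predict` and the example probabilities are hypothetical, chosen only for illustration. They are not taken from the paper or its code.

```python
from typing import List, Sequence


def average_columns(column_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across columns (hypothetical helper).

    column_probs[i][c] is the probability that column i assigns to class c.
    """
    if not column_probs:
        raise ValueError("need at least one column")
    n_classes = len(column_probs[0])
    if any(len(p) != n_classes for p in column_probs):
        raise ValueError("all columns must score the same number of classes")
    n_cols = len(column_probs)
    # Unweighted mean over columns, computed class by class.
    return [sum(p[c] for p in column_probs) / n_cols for c in range(n_classes)]


def predict(column_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    avg = average_columns(column_probs)
    return max(range(len(avg)), key=avg.__getitem__)


# Made-up outputs of three columns for one image and three classes.
# In the paper's setup each column would see a differently preprocessed input.
columns = [
    [0.6, 0.3, 0.1],  # column 1 prefers class 0
    [0.2, 0.7, 0.1],  # column 2 prefers class 1
    [0.3, 0.6, 0.1],  # column 3 prefers class 1
]
averaged = average_columns(columns)
label = predict(columns)
```

In this toy case the first column alone would pick class 0, but the averaged vector favors class 1, which is the kind of disagreement the averaging is meant to smooth out.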

Code Implementations: 1 repo