CVLGNov 12, 2020

Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning

arXiv:2011.06190v11 citations
AI Analysis

This work addresses the need for efficient, real-time image classification models, particularly for applications requiring lightweight and fast processing, though it appears incremental as it builds on existing attention and reinforcement learning methods.

The paper tackles the problem of real-time image classification by proposing a lightweight deep neural network that uses stochastic retina-inspired glimpses and reinforcement learning, achieving performance that outperforms conventional CNNs on large-scale datasets like Large cluttered MNIST, Large CIFAR-10, and Large CIFAR-100.

Previous studies on image classification have mainly focused on the performance of the networks, not on real-time operation or model compression. We propose a Gaussian Deep Recurrent visual Attention Model (GDRAM)- a reinforcement learning based lightweight deep neural network for large scale image classification that outperforms the conventional CNN (Convolutional Neural Network) which uses the entire image as input. Highly inspired by the biological visual recognition process, our model mimics the stochastic location of the retina with Gaussian distribution. We evaluate the model on Large cluttered MNIST, Large CIFAR-10 and Large CIFAR-100 datasets which are resized to 128 in both width and height.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes