CVJun 24, 2015

A Novel Feature Extraction Method for Scene Recognition Based on Centered Convolutional Restricted Boltzmann Machines

arXiv:1506.07257v144 citations
Originality Incremental advance
AI Analysis

This work addresses scene recognition in computer vision, offering an incremental improvement in feature extraction methods for handling large images.

The authors tackled the problem of scene recognition by proposing a novel feature extraction method called Centered Convolutional Restricted Boltzmann Machines (CCRBM), which improves stability and performance over existing models, achieving better results on datasets like MIT-indoor scenes and Caltech 101.

Scene recognition is an important research topic in computer vision, while feature extraction is a key step of object recognition. Although classical Restricted Boltzmann machines (RBM) can efficiently represent complicated data, it is hard to handle large images due to its complexity in computation. In this paper, a novel feature extraction method, named Centered Convolutional Restricted Boltzmann Machines (CCRBM), is proposed for scene recognition. The proposed model is an improved Convolutional Restricted Boltzmann Machines (CRBM) by introducing centered factors in its learning strategy to reduce the source of instabilities. First, the visible units of the network are redefined using centered factors. Then, the hidden units are learned with a modified energy function by utilizing a distribution function, and the visible units are reconstructed using the learned hidden units. In order to achieve better generative ability, the Centered Convolutional Deep Belief Networks (CCDBN) is trained in a greedy layer-wise way. Finally, a softmax regression is incorporated for scene recognition. Extensive experimental evaluations using natural scenes, MIT-indoor scenes, and Caltech 101 datasets show that the proposed approach performs better than other counterparts in terms of stability, generalization, and discrimination. The CCDBN model is more suitable for natural scene image recognition by virtue of convolutional property.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes