CV · Aug 24, 2017

A Robust Indoor Scene Recognition Method based on Sparse Representation

arXiv:1708.07555v1 · 17 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for indoor scene recognition in computer vision, addressing the loss of local details in standard CNN approaches.

The paper tackles indoor scene recognition by creating a new representation combining global and local CNN features with sparse coding to capture both environment structure and object details. The method outperforms previous approaches on Scene15 and MIT67 datasets, shows competitive performance on SUN397, and demonstrates robustness to image perturbations like noise and occlusion.

In this paper, we present a robust method for scene recognition that leverages Convolutional Neural Network (CNN) features and sparse coding to create a new representation of indoor scenes. Although CNNs have greatly benefited the fields of computer vision and pattern recognition, convolutional layers adjust their weights globally, which may lose important local details such as objects and small structures. Our proposed scene representation relies on both global features, which mostly reflect the environment's structure, and local features, which are sparsely combined to capture characteristics of the objects common to a given scene. This new representation is based on fragments of the scene and leverages features extracted by CNNs. The experimental evaluation shows that the resulting representation outperforms previous scene recognition methods on the Scene15 and MIT67 datasets and performs competitively on SUN397, while being highly robust to perturbations in the input image such as noise and occlusion.
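
The idea of pairing a global feature with sparsely coded local features can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: random vectors stand in for CNN features, the dictionary `D` is random rather than learned, the sparse codes are computed with plain ISTA, and the per-fragment codes are max-pooled before being concatenated with the global feature. The function names `sparse_code` and `scene_representation` are hypothetical.

```python
import numpy as np

def sparse_code(x, D, lam=0.1, n_iter=200):
    """ISTA for min_a 0.5*||x - D a||^2 + lam*||a||_1 (illustrative solver)."""
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the quadratic term
        z = a - step * grad                # gradient step
        # Soft-thresholding: proximal operator of the L1 penalty.
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return a

def scene_representation(global_feat, local_feats, D, lam=0.1):
    """Concatenate a normalized global feature with pooled sparse local codes.

    Max-pooling over fragments is an assumption, not the paper's stated choice.
    """
    codes = np.stack([sparse_code(f, D, lam) for f in local_feats])
    pooled = np.abs(codes).max(axis=0)     # one value per dictionary atom
    g = global_feat / (np.linalg.norm(global_feat) + 1e-12)
    return np.concatenate([g, pooled])

# Stand-in data: random vectors in place of real CNN activations.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))               # hypothetical dictionary
D /= np.linalg.norm(D, axis=0, keepdims=True)    # unit-norm atoms
global_feat = rng.standard_normal(256)           # whole-image feature
local_feats = rng.standard_normal((5, 64))       # features of 5 scene fragments
rep = scene_representation(global_feat, local_feats, D)
```

In this sketch `rep` has one entry per global feature dimension plus one per dictionary atom. A real pipeline would learn `D` from training fragments and feed `rep` to a classifier.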

