CV · Jun 17, 2015

A Discriminative Representation of Convolutional Features for Indoor Scene Recognition

arXiv:1506.05196v1 · 74 citations
Originality: Highly original
AI Analysis

This work addresses indoor scene recognition, a challenging problem in computer vision. It is an incremental improvement that applies a novel method to a known bottleneck.

The paper tackles indoor scene recognition by transforming structured convolutional features into a discriminative space that incorporates dataset-specific discriminative aspects and encodes general object categories, achieving a significant performance boost over previous state-of-the-art approaches on five major datasets.

Indoor scene recognition is a multi-faceted and challenging problem due to diverse intra-class variations and confusing inter-class similarities. This paper presents a novel approach that exploits rich mid-level convolutional features to categorize indoor scenes. Traditionally used convolutional features preserve the global spatial structure, which is a desirable property for general object recognition. However, we argue that this structure is of little help when scene layouts vary widely, as they do in indoor scenes. We propose to transform the structured convolutional activations into another, highly discriminative feature space. The representation in the transformed space not only incorporates the discriminative aspects of the target dataset, but also encodes the features in terms of the general object categories that are present in indoor scenes. To this end, we introduce a new large-scale dataset of 1300 object categories that are commonly present in indoor scenes. Our proposed approach achieves a significant performance boost over previous state-of-the-art approaches on five major scene classification datasets.
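The abstract's central idea, replacing spatially structured activations with a representation expressed relative to a set of reference elements, can be illustrated with a toy sketch. This is not the authors' implementation: the function name `encode_image`, the list-of-vectors `codebook`, the cosine similarity, and the max pooling are all assumptions chosen for illustration. In the paper's setting the codebook would presumably combine dataset-specific discriminative elements with object-category elements; here it is just a hypothetical list of vectors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def encode_image(patch_features, codebook):
    """Encode an image as one similarity score per codebook element.

    patch_features: list of feature vectors, one per image patch
                    (stand-ins for mid-level convolutional activations).
    codebook:       list of reference vectors (hypothetical; an assumption
                    of this sketch, not taken from the paper).

    For each codebook element we keep the maximum cosine similarity over
    all patches, so the result does not depend on where a patch occurred.
    """
    patches = [l2_normalize(p) for p in patch_features]
    atoms = [l2_normalize(a) for a in codebook]
    encoding = []
    for atom in atoms:
        sims = [sum(p_i * a_i for p_i, a_i in zip(p, atom)) for p in patches]
        encoding.append(max(sims))
    return encoding


# Two toy "images" containing the same patches in a different spatial order.
patches_a = [[1.0, 0.0], [0.0, 1.0]]
patches_b = [[0.0, 1.0], [1.0, 0.0]]
codebook = [[1.0, 0.0], [1.0, 1.0]]

enc_a = encode_image(patches_a, codebook)
enc_b = encode_image(patches_b, codebook)
```

Because the pooling step discards patch positions, `enc_a` and `enc_b` are identical even though the patch order differs, which mirrors the abstract's argument that global spatial structure matters less when layouts vary widely.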
