CV LG | Aug 14, 2018

Improving Generalization via Scalable Neighborhood Component Analysis

Zhirong Wu, Alexei A. Efros, Stella X. Yu

arXiv:1808.04699v125.3153 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses generalization in visual recognition under open-set scenarios. It offers a scalable non-parametric method that improves the learned feature representation, although the advance is incremental because it builds on the existing NCA technique.

The paper tackles visual recognition in open-set scenarios, where new categories arrive with few examples. It takes a non-parametric approach: rather than training a parametric classifier, it optimizes feature embeddings under the Neighborhood Component Analysis (NCA) criterion and uses an augmented memory to scale NCA to large datasets. The authors report strong ImageNet classification performance for such a simple method, along with a feature representation that generalizes better to sub-category discovery and few-shot recognition.

Current major approaches to visual recognition follow an end-to-end formulation that classifies an input image into one of a pre-determined set of semantic categories. Parametric softmax classifiers are a common choice for such a closed world with fixed categories, especially when big labeled data is available during training. However, this becomes problematic for open-set scenarios where new categories are encountered with very few examples for learning a generalizable parametric classifier. We adopt a non-parametric approach for visual recognition by optimizing feature embeddings instead of parametric classifiers. We use a deep neural network to learn a visual feature that preserves the neighborhood structure in the semantic space, based on the Neighborhood Component Analysis (NCA) criterion. Because NCA is limited by computational bottlenecks, we devise a mechanism that uses augmented memory to scale it to large datasets and very deep networks. Our experiments deliver not only remarkable performance on ImageNet classification for such a simple non-parametric method, but most importantly a more generalizable feature representation for sub-category discovery and few-shot recognition.
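The abstract does not spell out the objective, so the following is only a minimal NumPy sketch of how an NCA-style criterion can be evaluated against a memory of stored embeddings. It is not the authors' implementation. Cosine similarity, the `temperature` value, the `momentum` update rule, and all function names are assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def nca_loss(features, labels, indices, memory, memory_labels, temperature=0.05):
    """Mean negative log-probability of picking a same-class neighbor.

    features:      (b, d) embeddings of the current mini-batch
    labels:        (b,)   class labels of the mini-batch
    indices:       (b,)   position of each batch item in the dataset
    memory:        (n, d) stored embeddings for the whole training set
    memory_labels: (n,)   class labels of the stored embeddings
    temperature is an assumed hyperparameter, not a value from the paper.
    """
    logits = l2_normalize(features) @ l2_normalize(memory).T / temperature
    # Leave-one-out: an item may not choose itself as its neighbor.
    logits[np.arange(len(indices)), indices] = -np.inf
    # Softmax over all remaining memory entries (shifted for stability).
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # Probability mass that lands on neighbors sharing the item's label.
    same_class = labels[:, None] == memory_labels[None, :]
    p_correct = (probs * same_class).sum(axis=1)
    return float(-np.log(p_correct + 1e-12).mean())


def update_memory(memory, indices, features, momentum=0.5):
    """Blend fresh batch embeddings into their memory slots, then renormalize.

    The momentum blend is an assumed update rule for this sketch.
    """
    blended = momentum * memory[indices] + (1.0 - momentum) * l2_normalize(features)
    memory[indices] = l2_normalize(blended)
    return memory


# Toy usage: four stored embeddings forming two classes, batch of two items.
memory = l2_normalize(np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]))
memory_labels = np.array([0, 0, 1, 1])
batch_idx = np.array([0, 2])
loss = nca_loss(memory[batch_idx], memory_labels[batch_idx], batch_idx,
                memory, memory_labels)
```

Reading neighbors from a stored memory, rather than recomputing every training embedding at each step, is what keeps the per-batch cost manageable in this sketch. In a real training loop the loss would be differentiated with respect to the batch features by an autodiff framework, which is omitted here.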
