CVOct 7, 2020

Learning Clusterable Visual Features for Zero-Shot Recognition

arXiv:2010.03245v2
Originality Incremental advance
AI Analysis

This work addresses performance degradation in zero-shot recognition for computer vision applications, representing an incremental advance.

The paper tackles the problem of 'hard' testing data in zero-shot learning by learning clusterable visual features, resulting in consistent improvements over previous state-of-the-art results on SUN, CUB, and AWA2 datasets.

In zero-shot learning (ZSL), conditional generators have been widely used to generate additional training features. These features can then be used to train the classifiers for testing data. However, some testing data are considered "hard" as they lie close to the decision boundaries and are prone to misclassification, leading to performance degradation for ZSL. In this paper, we propose to learn clusterable features for ZSL problems. Using a Conditional Variational Autoencoder (CVAE) as the feature generator, we project the original features to a new feature space supervised by an auxiliary classification loss. To further increase clusterability, we fine-tune the features using Gaussian similarity loss. The clusterable visual features are not only more suitable for CVAE reconstruction but are also more separable which improves classification accuracy. Moreover, we introduce Gaussian noise to enlarge the intra-class variance of the generated features, which helps to improve the classifier's robustness. Our experiments on SUN,CUB, and AWA2 datasets show consistent improvement over previous state-of-the-art ZSL results by a large margin. In addition to its effectiveness on zero-shot classification, experiments show that our method to increase feature clusterability benefits few-shot learning algorithms as well.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes