CV · Oct 7, 2020

Learning Clusterable Visual Features for Zero-Shot Recognition

arXiv:2010.03245v21.2

Originality: Incremental advance

AI Analysis

This work addresses performance degradation in zero-shot recognition for computer vision applications, representing an incremental advance.

The paper tackles the problem of 'hard' testing data in zero-shot learning by learning clusterable visual features, resulting in consistent improvements over previous state-of-the-art results on SUN, CUB, and AWA2 datasets.

In zero-shot learning (ZSL), conditional generators have been widely used to generate additional training features, which can then be used to train classifiers for the testing data. However, some testing data are considered "hard" because they lie close to the decision boundaries and are prone to misclassification, leading to performance degradation for ZSL. In this paper, we propose to learn clusterable features for ZSL problems. Using a Conditional Variational Autoencoder (CVAE) as the feature generator, we project the original features to a new feature space supervised by an auxiliary classification loss. To further increase clusterability, we fine-tune the features using a Gaussian similarity loss. The clusterable visual features are not only more suitable for CVAE reconstruction but also more separable, which improves classification accuracy. Moreover, we introduce Gaussian noise to enlarge the intra-class variance of the generated features, which helps to improve the classifier's robustness. Our experiments on the SUN, CUB, and AWA2 datasets show consistent improvement over previous state-of-the-art ZSL results by a large margin. In addition to its effectiveness on zero-shot classification, experiments show that our method to increase feature clusterability benefits few-shot learning algorithms as well.
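The noise step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the noise scale `sigma`, and the toy "generated" features below are all assumptions made for illustration, and the abstract does not say how or where the noise is injected. The sketch only shows the general effect described, that adding zero-mean Gaussian noise to tightly clustered features of one class enlarges their intra-class variance.

```python
import random
from statistics import pvariance


def add_gaussian_noise(features, sigma=0.1, seed=None):
    """Return a new list of feature vectors with i.i.d. N(0, sigma^2)
    noise added to every dimension. The input is left unmodified.
    `sigma` is a hypothetical hyperparameter, not a value from the paper."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in features]


def mean_per_dim_variance(features):
    """Average, over feature dimensions, of the population variance
    of that dimension across all samples."""
    dims = list(zip(*features))
    return sum(pvariance(d) for d in dims) / len(dims)


# Toy stand-in for generator output (assumption): 200 four-dimensional
# features of a single class, tightly clustered around 1.0.
_rng = random.Random(0)
generated = [[1.0 + _rng.gauss(0.0, 0.05) for _ in range(4)] for _ in range(200)]

noisy = add_gaussian_noise(generated, sigma=0.2, seed=1)

var_before = mean_per_dim_variance(generated)
var_after = mean_per_dim_variance(noisy)
print(f"intra-class variance before: {var_before:.4f}, after: {var_after:.4f}")
```

In a real pipeline the noisy features would be passed to the classifier in place of (or alongside) the raw generated ones; the choice of `sigma` would need tuning, since too much noise blurs class boundaries rather than improving robustness.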
