Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification
The paper addresses zero-shot recognition for image classification, offering a flexible approach that does not require class labels during training, though the method appears incremental.
The paper tackles zero-shot image classification by formulating semantic embedding as a metric learning problem, achieving state-of-the-art results on four challenging datasets.
This paper addresses the task of zero-shot image classification. The key contribution of the proposed approach is to control the semantic embedding of images -- one of the main ingredients of zero-shot learning -- by formulating it as a metric learning problem. The optimized empirical criterion combines two types of sub-task constraints: metric discriminating capacity and accurate attribute prediction. This results in a novel formulation of zero-shot learning that does not require the notion of class in the training phase: only image/attribute pairs, each augmented with a consistency indicator, are given as ground truth. At test time, the learned model can predict the consistency of a test image with a given set of attributes, allowing flexible ways to produce recognition inferences. Despite its simplicity, the proposed approach gives state-of-the-art results on four challenging datasets used for zero-shot recognition evaluation.
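To make the idea of scoring image/attribute consistency concrete, here is a minimal sketch, not the paper's actual model. It assumes a hypothetical linear embedding (`W_img`, `W_attr`) and uses a standard contrastive-style pair loss as a stand-in for the metric term of the criterion; the attribute prediction term mentioned in the abstract is not shown. All names, the margin value, and the toy data are illustrative assumptions.

```python
import numpy as np

def squared_distance(x, a, W_img, W_attr):
    """Squared Euclidean distance between the embedded image and the
    embedded attribute vector (hypothetical linear embeddings)."""
    diff = W_img @ x - W_attr @ a
    return float(diff @ diff)

def pair_loss(x, a, consistent, W_img, W_attr, margin=1.0):
    """Contrastive-style loss on one (image, attributes, indicator) triple.
    Assumption: consistent pairs are pulled together, inconsistent pairs
    are pushed at least `margin` apart. No class label is involved."""
    d = squared_distance(x, a, W_img, W_attr)
    return d if consistent else max(0.0, margin - d)

def predict_class(x, class_attributes, W_img, W_attr):
    """Test-time inference: pick the attribute description that is most
    consistent with the image, i.e. at the smallest distance."""
    distances = [squared_distance(x, a, W_img, W_attr) for a in class_attributes]
    return int(np.argmin(distances))

# Toy example: identity embeddings, so image features already live in
# attribute space (an assumption made only to keep the numbers readable).
W_img = np.eye(3)
W_attr = np.eye(3)
x = np.array([0.9, 0.1, 0.8])                 # image features
unseen_class_attributes = np.array([
    [1.0, 0.0, 1.0],                          # attributes of unseen class 0
    [0.0, 1.0, 0.0],                          # attributes of unseen class 1
])

predicted = predict_class(x, unseen_class_attributes, W_img, W_attr)
```

Because inference only compares an image against attribute vectors, the same scoring function can rank arbitrary attribute sets at test time, which is one way to read the "flexible recognition inferences" described above.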