CV · Mar 2, 2016

Synthesized Classifiers for Zero-Shot Learning

arXiv:1603.00550v3 · 752 citations
AI Analysis

The paper addresses the problem of recognizing object classes for which no training examples exist, a core challenge in computer vision, and represents an incremental improvement over prior zero-shot methods.

The paper tackles zero-shot learning by aligning semantic and visual feature spaces using optimized 'phantom' classes as bases, achieving superior accuracy on four benchmarks including ImageNet with over 20,000 unseen classes.

Given semantic descriptions of object classes, zero-shot learning aims to accurately recognize objects of the unseen classes, from which no examples are available at the training stage, by associating them to the seen classes, from which labeled examples are provided. We propose to tackle this problem from the perspective of manifold learning. Our main idea is to align the semantic space that is derived from external information to the model space that concerns itself with recognizing visual features. To this end, we introduce a set of "phantom" object classes whose coordinates live in both the semantic space and the model space. Serving as bases in a dictionary, they can be optimized from labeled data such that the synthesized real object classifiers achieve optimal discriminative performance. We demonstrate superior accuracy of our approach over the state of the art on four benchmark datasets for zero-shot learning, including the full ImageNet Fall 2011 dataset with more than 20,000 unseen classes.
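The abstract's central idea, building each real class's classifier as a combination of phantom-class bases, can be illustrated with a minimal sketch. Several details here are assumptions for illustration rather than the paper's stated formulation: the abstract does not say how combination weights are computed, so this sketch uses a softmax over negative squared distances in the semantic space. The function names, the `bandwidth` parameter, and the toy embeddings are likewise hypothetical. The sketch also takes the phantom coordinates as given, whereas the paper optimizes them from labeled data.

```python
import math


def combination_weights(class_embedding, phantom_embeddings, bandwidth=1.0):
    """Weights of one real class over the phantom classes.

    Assumption: softmax over negative squared Euclidean distances
    between the class and each phantom class in the semantic space.
    """
    scores = []
    for phantom in phantom_embeddings:
        sq_dist = sum((a - b) ** 2 for a, b in zip(class_embedding, phantom))
        scores.append(-sq_dist / bandwidth)
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def synthesize_classifier(class_embedding, phantom_embeddings,
                          phantom_classifiers, bandwidth=1.0):
    """Build a linear classifier as a weighted sum of phantom classifiers."""
    weights = combination_weights(class_embedding, phantom_embeddings, bandwidth)
    dim = len(phantom_classifiers[0])
    return [
        sum(w * v[j] for w, v in zip(weights, phantom_classifiers))
        for j in range(dim)
    ]


def predict(feature, class_embeddings, phantom_embeddings, phantom_classifiers):
    """Return the class label whose synthesized classifier scores highest."""
    best_label, best_score = None, -math.inf
    for label, embedding in class_embeddings.items():
        w = synthesize_classifier(embedding, phantom_embeddings,
                                  phantom_classifiers)
        score = sum(wi * xi for wi, xi in zip(w, feature))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy setup (all values hypothetical): two phantom classes, each with a
# coordinate in the semantic space and one in the model (classifier) space.
phantom_embeddings = [[0.0, 0.0], [1.0, 1.0]]
phantom_classifiers = [[1.0, 0.0], [0.0, 1.0]]

# Unseen classes are described only by semantic embeddings, with no images.
unseen_classes = {"zebra": [0.1, 0.0], "whale": [0.9, 1.0]}

label = predict([2.0, 0.5], unseen_classes, phantom_embeddings,
                phantom_classifiers)
```

Because the unseen classes enter only through their semantic embeddings, a classifier can be synthesized for any class with a semantic description, which is what lets this kind of approach scale to a large number of unseen classes.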

Code Implementations · 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
