Semi-supervised Vocabulary-informed Learning
The paper addresses the problem of scalable object recognition for computer vision applications, but the contribution is incremental, as it builds on existing zero-shot and open set methods.
The paper tackles two challenges: learning from limited labeled data and recognizing objects within large, open label sets. It proposes a semi-supervised vocabulary-informed learning framework that improves supervised, zero-shot, and open set recognition, with results on the AwA and ImageNet datasets using vocabularies of up to 310K classes.
Despite significant progress in object categorization in recent years, a number of important challenges remain, mainly the ability to learn from limited labeled data and the ability to recognize object classes within a large, potentially open, set of labels. Zero-shot learning is one way of addressing these challenges, but it has only been shown to work with limited-size class vocabularies and typically requires separation between supervised and unsupervised classes, allowing the former to inform the latter but not vice versa. We propose the notion of semi-supervised vocabulary-informed learning to alleviate the above-mentioned challenges and address the problems of supervised, zero-shot, and open set recognition using a unified framework. Specifically, we propose a maximum margin framework for semantic manifold-based recognition that incorporates distance constraints from (both supervised and unsupervised) vocabulary atoms, ensuring that labeled samples are projected closer to their correct prototypes in the embedding space than to others. We show that the resulting model improves supervised, zero-shot, and large open set recognition, with a vocabulary of up to 310K classes on the AwA and ImageNet datasets.
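The distance constraints described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the linear projection `W`, the squared Euclidean distance, the `margin` value, and the function names `margin_violations` and `predict` are all assumptions made for illustration. The idea shown is only that each labeled sample, once projected into the embedding space, should lie closer to its correct vocabulary prototype than to any other prototype (supervised or not) by at least a margin.

```python
import numpy as np


def squared_distances(X, W, prototypes):
    """Squared Euclidean distance from each projected sample to each prototype.

    X: (n, p) features, W: (p, d) assumed linear projection,
    prototypes: (V, d) word-vector prototypes for the whole vocabulary.
    Returns an (n, V) array.
    """
    Z = X @ W  # project samples into the embedding space
    diff = Z[:, None, :] - prototypes[None, :, :]
    return (diff ** 2).sum(axis=2)


def margin_violations(X, y, W, prototypes, margin=1.0):
    """Hinge penalty for every (sample, wrong prototype) pair.

    A pair is penalized when the wrong prototype is not at least `margin`
    farther away than the correct prototype y[i].
    """
    d2 = squared_distances(X, W, prototypes)
    rows = np.arange(len(y))
    correct = d2[rows, y][:, None]           # distance to the true prototype
    hinge = np.maximum(0.0, margin + correct - d2)
    hinge[rows, y] = 0.0                     # no constraint against itself
    return hinge


def predict(X, W, prototypes):
    """Nearest-prototype classification over the full (open) vocabulary."""
    return squared_distances(X, W, prototypes).argmin(axis=1)


# Toy example: two labeled classes plus one extra vocabulary atom
# that has no training samples (hypothetical values).
X = np.eye(2)
W = np.eye(2)
prototypes = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
y = np.array([0, 1])

loss = margin_violations(X, y, W, prototypes, margin=3.0).sum()
labels = predict(X, W, prototypes)
```

In a training loop, the summed hinge penalty would be minimized with respect to `W`, typically together with a regression term that pulls each sample toward its own prototype. Because the penalty ranges over every prototype in the vocabulary, atoms without labeled samples still shape the embedding.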