CVLGOct 19, 2018

Transferrable Feature and Projection Learning with Class Hierarchy for Zero-Shot Learning

arXiv:1810.08329v122 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of recognizing unseen classes without training samples for researchers in computer vision, though it is incremental as it builds on existing ZSL methods by incorporating superclass hierarchies.

The paper tackles the domain gap challenge in zero-shot learning by leveraging automatically generated superclasses as a bridge between seen and unseen classes, resulting in a model that significantly outperforms state-of-the-art alternatives in both ZSL and few-shot learning tasks.

Zero-shot learning (ZSL) aims to transfer knowledge from seen classes to unseen ones so that the latter can be recognised without any training samples. This is made possible by learning a projection function between a feature space and a semantic space (e.g. attribute space). Considering the seen and unseen classes as two domains, a big domain gap often exists which challenges ZSL. Inspired by the fact that an unseen class is not exactly `unseen' if it belongs to the same superclass as a seen class, we propose a novel inductive ZSL model that leverages superclasses as the bridge between seen and unseen classes to narrow the domain gap. Specifically, we first build a class hierarchy of multiple superclass layers and a single class layer, where the superclasses are automatically generated by data-driven clustering over the semantic representations of all seen and unseen class names. We then exploit the superclasses from the class hierarchy to tackle the domain gap challenge in two aspects: deep feature learning and projection function learning. First, to narrow the domain gap in the feature space, we integrate a recurrent neural network (RNN) defined with the superclasses into a convolutional neural network (CNN), in order to enforce the superclass hierarchy. Second, to further learn a transferrable projection function for ZSL, a novel projection function learning method is proposed by exploiting the superclasses to align the two domains. Importantly, our transferrable feature and projection learning methods can be easily extended to a closely related task -- few-shot learning (FSL). Extensive experiments show that the proposed model significantly outperforms the state-of-the-art alternatives in both ZSL and FSL tasks.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes