A Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts
This work addresses the problem of recognizing unseen classes without visual examples in computer vision applications, and represents an incremental improvement over existing methods.
The paper tackles zero-shot learning by using GANs to generate visual features from noisy text descriptions, which converts the problem into traditional classification; it reports state-of-the-art performance on the largest available text-based zero-shot learning benchmarks.
Most existing zero-shot learning methods treat the problem as one of visual-semantic embedding. Given the demonstrated capability of Generative Adversarial Networks (GANs) to generate images, we instead leverage GANs to imagine unseen categories from text descriptions and hence recognize novel classes for which no examples have been seen. Specifically, we propose a simple yet effective generative model that takes as input noisy text descriptions of an unseen class (e.g., Wikipedia articles) and generates synthesized visual features for this class. With the added pseudo data, zero-shot learning is naturally converted into a traditional classification problem. Additionally, to preserve the inter-class discrimination of the generated features, a visual pivot regularization is proposed as an explicit supervision. Unlike previous methods that use complex engineered regularizers, our approach can suppress the noise well without additional regularization. Empirically, we show that our method consistently outperforms the state of the art on the largest available benchmarks for text-based zero-shot learning.
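The pipeline described above (generate visual features from text, regularize them toward a class pivot, then classify with the pseudo data) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the generator is an untrained random linear map with a tanh (the names `generate_features`, `W_text`, and `W_noise` are hypothetical), the visual pivot term is taken to be the squared distance between the mean generated feature and the mean real feature of a class, and the final classifier is a plain nearest-neighbor rule. No adversarial training is performed here.

```python
import numpy as np

rng = np.random.default_rng(0)


def generate_features(text_emb, W_text, W_noise, n_samples, rng):
    """Hypothetical generator: text embedding + Gaussian noise -> visual features.

    A real model would be a trained network; this is a fixed random map.
    """
    z = rng.standard_normal((n_samples, W_noise.shape[0]))
    # text_emb @ W_text has shape (feat_dim,) and broadcasts over the samples.
    return np.tanh(text_emb @ W_text + z @ W_noise)


def visual_pivot_loss(fake_feats, real_feats):
    """Assumed form of the pivot term for one class: squared distance
    between the mean generated feature and the mean real feature."""
    diff = fake_feats.mean(axis=0) - real_feats.mean(axis=0)
    return float(np.sum(diff ** 2))


def nearest_neighbor_predict(query, pseudo_feats, pseudo_labels):
    """Label each query feature with the class of its closest pseudo feature."""
    dists = ((query[:, None, :] - pseudo_feats[None, :, :]) ** 2).sum(axis=2)
    return pseudo_labels[dists.argmin(axis=1)]


# Toy dimensions, chosen arbitrarily for the sketch.
text_dim, noise_dim, feat_dim, n_per_class = 8, 4, 16, 50
W_text = rng.standard_normal((text_dim, feat_dim))
W_noise = 0.1 * rng.standard_normal((noise_dim, feat_dim))

# Stand-ins for text embeddings of two unseen classes.
unseen_text = rng.standard_normal((2, text_dim))

# Synthesize pseudo visual features for each unseen class.
pseudo_feats = np.concatenate(
    [generate_features(t, W_text, W_noise, n_per_class, rng) for t in unseen_text]
)
pseudo_labels = np.repeat(np.arange(2), n_per_class)

# With pseudo data in hand, unseen-class recognition is ordinary classification.
predictions = nearest_neighbor_predict(pseudo_feats, pseudo_feats, pseudo_labels)
```

In this sketch the pivot loss would be added to the generator's objective during training on seen classes, where real features are available; at test time only the text of the unseen classes is needed to produce `pseudo_feats`.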