Class Knowledge Overlay to Visual Feature Learning for Zero-Shot Image Classification
This work addresses the generation of high-quality synthesized visual features for zero-shot image classification. The contribution is incremental: it builds on existing generative adversarial network methods to improve semantic consistency.
The paper tackles the challenge of ensuring semantic consistency between semantic and visual features in zero-shot image classification. It proposes GAN-CST, an approach that combines class knowledge overlay, semi-supervised learning, and triplet loss, and reports superior performance over state-of-the-art methods on benchmark datasets.
In zero-shot image classification, new categories can be discovered without corresponding training samples by transforming semantic features into synthesized visual features. Although significant progress has been made in generating high-quality synthesized visual features with generative adversarial networks, guaranteeing semantic consistency between the semantic features and the visual features remains very challenging. In this paper, we propose a novel zero-shot learning approach, GAN-CST, based on class-knowledge-to-visual-feature learning, to tackle this problem. The approach consists of three parts: class knowledge overlay, semi-supervised learning, and triplet loss. Class knowledge overlay (CKO) draws knowledge not only from the corresponding class but also from other classes that share the knowledge overlay, which ensures that the knowledge-to-visual learning process has adequate information to generate synthesized visual features. The approach also applies a semi-supervised learning process to re-train the knowledge-to-visual model, which reinforces both synthesized visual feature generation and new-category prediction. We tabulate results on a number of benchmark datasets, demonstrating that the proposed model delivers superior performance over state-of-the-art approaches.
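
The abstract names triplet loss as one of the three components but does not give its formulation. As a point of reference only, the following is a minimal sketch of the standard margin-based triplet loss, not the paper's exact objective. The function name `triplet_loss`, the squared Euclidean distance, and the default margin of 1.0 are all assumptions made for illustration. In a setting like this one, the anchor could be a synthesized visual feature, the positive a real feature of the same class, and the negative a feature of a different class.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss, averaged over a batch.

    Hypothetical helper: the distance metric and margin are assumptions,
    not values taken from the paper. Each input has shape (batch, dim).
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)

    # Squared Euclidean distance from each anchor to its positive / negative.
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)

    # Penalize triplets where the negative is not at least `margin`
    # farther from the anchor than the positive is.
    per_triplet = np.maximum(0.0, d_pos - d_neg + margin)
    return float(np.mean(per_triplet))
```

The loss is zero once every negative is at least `margin` farther from its anchor than the matching positive, so only triplets that violate the margin contribute to the gradient in a trainable version of this function.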