CV | Dec 30, 2020

Bidirectional Mapping Coupled GAN for Generalized Zero-Shot Learning

arXiv:2012.15054v2 | 9 citations
AI Analysis

This work provides an incremental improvement for researchers and practitioners working on generalized zero-shot learning by enhancing feature synthesis and domain distinction.

This paper addresses a weakness in generalized zero-shot learning (GZSL): existing methods learn only from seen data and do not preserve domain distinction, so they struggle to synthesize high-quality features for both seen and unseen classes. The authors propose BMCoGAN, a bidirectional mapping coupled generative adversarial network that uses both seen and unseen class semantics to learn a joint distribution, and they report superior performance on benchmark datasets.

Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data. Therefore, learning a joint distribution of the seen and unseen domains and preserving domain distinction are crucial for these methods. However, existing methods learn only the underlying distribution of seen data, although unseen class semantics are available in the GZSL problem setting. Most methods also neglect to retain domain distinction and use the learned distribution to recognize both seen and unseen data. Consequently, they do not perform well. In this work, we utilize the available unseen class semantics alongside seen class semantics and learn a joint distribution through a strong visual-semantic coupling. We propose a bidirectional mapping coupled generative adversarial network (BMCoGAN) by extending the coupled generative adversarial network into a dual-domain learning bidirectional mapping model. We further integrate a Wasserstein generative adversarial optimization to supervise the joint distribution learning. We design a loss optimization that retains domain-distinctive information in the synthesized features and reduces bias towards seen classes: it pushes synthesized seen features towards real seen features and pulls synthesized unseen features away from real seen features. We evaluate BMCoGAN on benchmark datasets and demonstrate its superior performance against contemporary methods.
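
The push/pull idea described in the abstract can be illustrated with a small sketch. This is not the loss defined in the paper, which the abstract does not specify; it is a hypothetical stand-in that assumes a single centroid of real seen features, Euclidean distance, and a hinge margin. The function name `domain_distinction_loss` and the `margin` parameter are illustrative assumptions.

```python
import numpy as np

def domain_distinction_loss(real_seen, synth_seen, synth_unseen, margin=1.0):
    """Hypothetical push/pull loss; NOT the paper's exact formulation.

    Each argument is a 2-D array of shape (num_samples, feature_dim).
    """
    # Assumption: summarize the real seen domain by its mean feature vector.
    centroid = real_seen.mean(axis=0)

    # Term 1: move synthesized seen features towards the real seen centroid.
    toward_term = np.linalg.norm(synth_seen - centroid, axis=1).mean()

    # Term 2: penalize synthesized unseen features that lie closer than
    # `margin` to the real seen centroid (hinge on the distance).
    unseen_dist = np.linalg.norm(synth_unseen - centroid, axis=1)
    away_term = np.maximum(0.0, margin - unseen_dist).mean()

    return float(toward_term + away_term)
```

In this sketch the first term shrinks as synthesized seen features approach the real seen features, and the second term vanishes once synthesized unseen features are at least `margin` away from them, which is one simple way to express a reduced bias towards seen classes.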