CV CL Jul 12, 2021

Zero-Shot Compositional Concept Learning

arXiv:2107.05176v1712 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of low-resource learning for compositional concepts in computer vision, though the advance is incremental because it builds on existing zero-shot learning frameworks.

The paper tackles the recognition of novel compositional attribute-object concepts in zero-shot learning by proposing an episode-based cross-attention network, which improves on recent methods on two benchmarks.

In this paper, we study the problem of recognizing compositional attribute-object concepts within the zero-shot learning (ZSL) framework. We propose an episode-based cross-attention (EpiCA) network, which combines the merits of the cross-attention mechanism and an episode-based training strategy to recognize novel compositional concepts. First, EpiCA uses cross-attention to correlate concept and visual information and uses a gated pooling layer to build contextualized representations for both images and concepts. The updated representations are then used for a more in-depth multi-modal relevance calculation for concept recognition. Second, a two-phase episode training strategy, in particular its transductive phase, uses unlabeled test examples to alleviate the low-resource learning problem. Experiments on two widely used zero-shot compositional learning (ZSCL) benchmarks demonstrate the effectiveness of the model compared with recent approaches in both the conventional and generalized ZSCL settings.
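The abstract describes cross-attention between concept and visual features, followed by gated pooling and a relevance score. The following is a minimal sketch of that general pattern in plain Python, not the authors' implementation. The function names, the residual connection, the sigmoid gate, the shared gate vector, and the cosine relevance score are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys):
    """Contextualize each query with an attention-weighted sum of the keys.

    Assumption: scaled dot-product attention plus a residual connection.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        attended = [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]
        out.append([a + b for a, b in zip(q, attended)])
    return out


def gated_pool(vectors, gate_weights):
    """Pool a set of vectors into one, weighting each by a sigmoid gate.

    Assumption: the gate is a single learned vector (hypothetical here).
    """
    gates = [1.0 / (1.0 + math.exp(-dot(v, gate_weights))) for v in vectors]
    total = sum(gates)
    dim = len(vectors[0])
    return [sum(g * v[d] for g, v in zip(gates, vectors)) / total for d in range(dim)]


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def relevance(regions, concept_tokens, gate_weights):
    """Score how well an image (region features) matches a concept (token features)."""
    regions_ctx = cross_attend(regions, concept_tokens)   # image attends to concept
    concept_ctx = cross_attend(concept_tokens, regions)   # concept attends to image
    image_vec = gated_pool(regions_ctx, gate_weights)
    concept_vec = gated_pool(concept_ctx, gate_weights)
    return cosine(image_vec, concept_vec)


# Toy usage with random features: 3 image regions, 2 concept tokens
# (for example, one attribute and one object), 4 dimensions each.
random.seed(0)
regions = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
tokens = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]
gate = [random.gauss(0, 1) for _ in range(4)]
score = relevance(regions, tokens, gate)
print(score)
```

At prediction time, a model of this shape would score an image against every candidate attribute-object pair and choose the highest-scoring one. The episode-based training strategy the abstract mentions is not covered by this sketch.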

