CV, LG · Oct 22, 2020

Zero-Shot Learning from scratch (ZFS): leveraging local compositional representations

arXiv:2010.13320v1 · 6 citations
Originality: Synthesis-oriented
AI Analysis

This paper helps zero-shot classification researchers disentangle questions about representation learning from the influence of pre-trained parameters. The contribution is incremental: it refines the evaluation setting rather than introducing a new method.

The paper proposes a more challenging zero-shot learning setting, Zero-Shot Learning from scratch (ZFS), which forbids the use of encoders pre-trained on other datasets. Its analysis finds that local information and compositional representations are crucial for performance in this setting.

Zero-shot classification is a generalization task where no instance from the target classes is seen during training. To allow for test-time transfer, each class is annotated with semantic information, commonly in the form of attributes or text descriptions. While classical zero-shot learning does not explicitly forbid using information from other datasets, the approaches that achieve the best absolute performance on image benchmarks rely on features extracted from encoders pretrained on ImageNet. This approach relies on hyper-optimized ImageNet-relevant parameters from the supervised classification setting, entangling important questions about the suitability of those parameters and how they were learned with more fundamental questions about representation learning and generalization. To remove these distractors, we propose a more challenging setting: Zero-Shot Learning from scratch (ZFS), which explicitly forbids the use of encoders fine-tuned on other datasets. Our analysis on this setting highlights the importance of local information and compositional representations.
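As a rough illustration of how attribute annotations allow test-time transfer to unseen classes, the sketch below picks the unseen class whose attribute vector is closest to the attributes predicted for an image. This is a generic attribute-matching baseline and not the method analysed in the paper. The class names, attribute values, and the helper names `cosine_similarity` and `zero_shot_predict` are all hypothetical, and the `predicted` vector stands in for the output of an encoder trained only on the seen classes.

```python
import math

# Hypothetical attribute annotations for classes never seen in training.
# Each vector is (has_stripes, has_wings, lives_in_water); values are illustrative.
UNSEEN_CLASS_ATTRIBUTES = {
    "zebra": [1.0, 0.0, 0.0],
    "eagle": [0.0, 1.0, 0.0],
    "dolphin": [0.0, 0.0, 1.0],
}


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def zero_shot_predict(predicted_attributes, class_attributes):
    """Return the class whose attribute vector best matches the predicted attributes."""
    return max(
        class_attributes,
        key=lambda name: cosine_similarity(predicted_attributes, class_attributes[name]),
    )


# Assumed stand-in for attributes predicted by an encoder trained from scratch
# on the seen classes only (no ImageNet-pretrained features).
predicted = [0.9, 0.1, 0.2]
print(zero_shot_predict(predicted, UNSEEN_CLASS_ATTRIBUTES))
```

In the ZFS setting, the only change to such a pipeline would be where the predicted attributes come from: the encoder producing them may not be initialised from, or fine-tuned on, any other dataset.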
