CV LG ML
Jul 11, 2020

Towards Cross-Granularity Few-Shot Learning: Coarse-to-Fine Pseudo-Labeling with Visual-Semantic Meta-Embedding

arXiv:2007.05675v37.921 citations

Originality: Incremental advance

AI Analysis

This work addresses a challenging few-shot learning scenario that reduces annotation costs for fine-grained tasks, though it is incremental in adapting existing methods to a new setting.

The paper tackles cross-granularity few-shot classification, in which a model sees only coarse labels during training but must perform fine-grained classification at test time, which reduces annotation costs. It reports competitive results on three datasets using coarse-to-fine pseudo-labeling with a visual-semantic meta-embedder.

Few-shot learning aims at rapidly adapting to novel categories with only a handful of samples at test time, and has predominantly been tackled with meta-learning. However, meta-learning approaches essentially learn across a variety of few-shot tasks and thus still require large-scale training data with fine-grained supervision to derive a generalized model, which incurs prohibitive annotation cost. In this paper, we advance the few-shot classification paradigm towards a more challenging scenario, i.e., cross-granularity few-shot classification, where the model observes only coarse labels during training but is expected to perform fine-grained classification during testing. This task largely relieves the annotation cost, since fine-grained labeling usually requires strong domain-specific expertise. To bridge the cross-granularity gap, we approximate the fine-grained data distribution by greedily clustering each coarse class into pseudo-fine-classes according to the similarity of image embeddings. We then propose a meta-embedder that jointly optimizes visual and semantic discrimination, both instance-wise and coarse class-wise, to obtain a good feature space for this coarse-to-fine pseudo-labeling process. Extensive experiments and ablation studies demonstrate the effectiveness and robustness of our approach on three representative datasets.
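
The coarse-to-fine pseudo-labeling step can be illustrated with a minimal sketch. The abstract does not specify the exact greedy procedure, so the scheme below is an assumption: each sample joins the most similar existing cluster within its own coarse class if the cosine similarity to that cluster's centroid reaches a cutoff, and otherwise starts a new pseudo-fine-class. The function name `greedy_pseudo_labels` and the `threshold` parameter are hypothetical and not taken from the paper.

```python
import numpy as np


def greedy_pseudo_labels(embeddings, coarse_labels, threshold=0.8):
    """Split each coarse class into pseudo-fine-classes by greedy clustering.

    `threshold` is a hypothetical cosine-similarity cutoff (an assumption,
    not a value from the paper). Returns one integer pseudo-label per sample;
    pseudo-labels are never shared across coarse classes.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    coarse_labels = np.asarray(coarse_labels)

    # L2-normalize so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    pseudo = np.full(len(unit), -1, dtype=int)
    next_id = 0
    for coarse in np.unique(coarse_labels):
        member_idx = np.flatnonzero(coarse_labels == coarse)
        sums = []  # running sum of unit vectors for each cluster
        ids = []   # global pseudo-label id of each cluster
        for i in member_idx:
            if sums:
                centroids = np.stack(sums)
                centroid_norms = np.linalg.norm(centroids, axis=1, keepdims=True)
                centroids = centroids / np.clip(centroid_norms, 1e-12, None)
                sims = centroids @ unit[i]
                best = int(np.argmax(sims))
                if sims[best] >= threshold:
                    # Close enough: join the best-matching cluster.
                    sums[best] = sums[best] + unit[i]
                    pseudo[i] = ids[best]
                    continue
            # No cluster is similar enough: open a new pseudo-fine-class.
            sums.append(unit[i].copy())
            ids.append(next_id)
            pseudo[i] = next_id
            next_id += 1
    return pseudo
```

In this sketch the result depends on the order in which samples are visited and on the chosen cutoff. The resulting pseudo-labels could then stand in for fine-grained labels when sampling few-shot training episodes, which is the role the abstract assigns to the pseudo-fine-classes.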
