CV · Apr 1, 2020

Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition

Luming Tang, Davis Wertheimer, Bharath Hariharan

arXiv:2004.00705v1 · 14.361 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of learning subtle distinctions between classes with limited data, offering a practical solution for fine-grained recognition tasks.

The paper shows that pose-normalized representations, which localize semantic parts and describe the appearance of each part, improve few-shot fine-grained classification accuracy by 10 to 20 percentage points across shallow and deep architectures and multiple few-shot algorithms.

Few-shot, fine-grained classification requires a model to learn subtle, fine-grained distinctions between different classes (e.g., birds) based on a few images alone. This requires a remarkable degree of invariance to pose, articulation and background. A solution is to use pose-normalized representations: first localize semantic parts in each image, and then describe images by characterizing the appearance of each part. While such representations are out of favor for fully supervised classification, we show that they are extremely effective for few-shot fine-grained classification. With a minimal increase in model capacity, pose normalization improves accuracy between 10 and 20 percentage points for shallow and deep architectures, generalizes better to new domains, and is effective for multiple few-shot algorithms and network backbones. Code is available at https://github.com/Tsingularity/PoseNorm_Fewshot
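
The two-step idea in the abstract (localize parts, then describe each part's appearance) can be illustrated with a minimal sketch. This is not the authors' implementation; see the linked repository for that. It assumes a hypothetical part localizer has already produced one non-negative heatmap per semantic part, and it uses heatmap-weighted average pooling as one plausible way to build a per-part descriptor. The function name and the `eps` parameter are illustrative only.

```python
from typing import List

Grid = List[List[float]]  # one H x W map


def pose_normalized_descriptor(
    features: List[Grid],  # C feature channels, each H x W (assumed input)
    heatmaps: List[Grid],  # P part heatmaps, each H x W, non-negative (assumed input)
    eps: float = 1e-8,     # guards against division by zero for empty heatmaps
) -> List[float]:
    """Return a P*C vector: one heatmap-weighted average feature vector per part."""
    descriptor: List[float] = []
    for heat in heatmaps:
        # Total heatmap mass, used to normalize the weighted sum into an average.
        total = sum(sum(row) for row in heat) + eps
        for channel in features:
            weighted = sum(
                h * f
                for heat_row, feat_row in zip(heat, channel)
                for h, f in zip(heat_row, feat_row)
            )
            descriptor.append(weighted / total)
    return descriptor
```

Because each block of the descriptor is tied to one part rather than to one image location, two images of the same class in different poses should yield comparable vectors, provided the heatmaps find the parts. The resulting vector could then be passed to any few-shot classifier, for example a nearest-prototype rule.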

