Embedding Adaptation is Still Needed for Few-Shot Learning
This work addresses the need for better evaluation benchmarks in few-shot learning. It offers a tool for generating tasksets that avoids both overly optimistic and labor-intensive setups, though its methodology is incremental.
The authors tackle the problem of constructing realistic few-shot learning tasksets by proposing ATG, a clustering method that models train and test task distributions without additional human knowledge. Using the generated tasksets, they find that gradient-based methods can outperform metric-based ones when transfer is most challenging.
Constructing new and more challenging tasksets is a fruitful methodology to analyse and understand few-shot classification methods. Unfortunately, existing approaches to building those tasksets are somewhat unsatisfactory: they either assume train and test task distributions to be identical -- which leads to overly optimistic evaluations -- or take a "worst-case" philosophy -- which typically requires additional human labor such as obtaining semantic class relationships. We propose ATG, a principled clustering method for defining train and test tasksets without additional human knowledge. ATG models train and test task distributions while requiring them to share a predefined amount of information. We empirically demonstrate the effectiveness of ATG in generating tasksets that are easier, in-between, or harder than existing benchmarks, including those that rely on semantic information. Finally, we leverage our generated tasksets to shed new light on few-shot classification: gradient-based methods -- previously believed to underperform -- can outperform metric-based ones when transfer is most challenging.
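To make the idea of splitting classes by clustering concrete, the sketch below partitions a handful of classes into a train group and a test group using plain 2-means. This is not ATG: the abstract does not specify the algorithm or how the shared-information constraint is enforced, so 2-means is swapped in purely for illustration. The function name `two_means_split`, the per-class embedding vectors, and the rule that the larger cluster becomes the train set are all assumptions.

```python
import math


def two_means_split(class_embeddings, iters=20):
    """Split classes into (train, test) with plain 2-means.

    Illustrative stand-in only, NOT the ATG method from the paper.
    class_embeddings: dict mapping class name -> list of floats
    (hypothetical per-class feature vectors).
    """
    names = sorted(class_embeddings)
    # Deterministic initialisation: the first class (alphabetically)
    # and the class farthest from it.
    first = class_embeddings[names[0]]
    far = max(names, key=lambda n: math.dist(class_embeddings[n], first))
    centroids = [list(first), list(class_embeddings[far])]

    groups = [[], []]
    for _ in range(iters):
        # Assignment step: each class joins its nearest centroid.
        groups = [[], []]
        for n in names:
            d = [math.dist(class_embeddings[n], c) for c in centroids]
            groups[0 if d[0] <= d[1] else 1].append(n)
        # Update step: each centroid moves to the mean of its members.
        for k, members in enumerate(groups):
            if members:
                dim = len(centroids[k])
                centroids[k] = [
                    sum(class_embeddings[m][j] for m in members) / len(members)
                    for j in range(dim)
                ]

    # Assumption: the larger cluster is used for training.
    train, test = sorted(groups, key=len, reverse=True)
    return train, test


# Hypothetical 2-D class embeddings: three animals, two vehicles.
embeddings = {
    "cat": [0.0, 0.1],
    "dog": [0.1, 0.0],
    "wolf": [0.2, 0.1],
    "car": [5.0, 5.1],
    "truck": [5.1, 5.0],
}
train_classes, test_classes = two_means_split(embeddings)
print(train_classes, test_classes)
```

On this toy input the animal classes end up in one group and the vehicle classes in the other, so test tasks are drawn from classes unlike any seen in training. That corresponds to the "harder" end of the spectrum the abstract describes; a uniformly random class split would sit at the "easier" end.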