CV · Mar 26, 2021

MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning

Baoquan Zhang, Xutao Li, Shanshan Feng, Yunming Ye, Rui Ye

arXiv:2103.14341v28.746 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses prototype bias in few-shot learning for computer vision applications, offering an incremental improvement over existing meta-optimizers by mitigating gradient bias.

The paper tackles prototype bias in few-shot learning, which arises because class prototypes are estimated from very few samples. It proposes MetaNODE, a meta-learning framework that refines prototypes with a Neural ODE-based meta-optimizer trained to estimate a continuous gradient flow. The authors report improved performance on miniImagenet, tieredImagenet, and CUB-200-2011.

Few-Shot Learning (FSL) is the challenging task of recognizing novel classes from only a few examples. Pre-training based methods tackle the problem effectively by pre-training a feature extractor and then predicting novel classes via a cosine nearest neighbor classifier with mean-based prototypes. Nevertheless, due to data scarcity, the mean-based prototypes are usually biased. In this paper, we attempt to diminish the prototype bias by regarding it as a prototype optimization problem. To this end, we propose a novel meta-learning based prototype optimization framework to rectify prototypes, i.e., introducing a meta-optimizer to optimize prototypes. Although existing meta-optimizers can also be adapted to our framework, they all overlook a crucial gradient bias issue, i.e., the mean-based gradient estimation is also biased on sparse data. To address the issue, we regard the gradient and its flow as meta-knowledge and then propose a novel Neural Ordinary Differential Equation (ODE)-based meta-optimizer to polish prototypes, called MetaNODE. In this meta-optimizer, we first view the mean-based prototypes as initial prototypes, and then model the process of prototype optimization as continuous-time dynamics specified by a Neural ODE. A gradient flow inference network is carefully designed to learn to estimate the continuous gradient flow for prototype dynamics. Finally, the optimal prototypes can be obtained by solving the Neural ODE. Extensive experiments on miniImagenet, tieredImagenet, and CUB-200-2011 show the effectiveness of our method.
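The pipeline described in the abstract can be illustrated with a minimal sketch: compute mean-based prototypes, classify by cosine similarity, then treat prototype refinement as an ODE solved from the mean-based initial state. This is not the authors' implementation. All function names are hypothetical, the fixed-step Euler solver stands in for whatever ODE solver the paper uses, and `make_toy_flow` is a hand-written placeholder for the learned gradient flow inference network (it simply pulls each prototype toward an assumed reference center).

```python
import math


def mean_prototypes(features, labels):
    """Average the support features of each class (the initial prototypes)."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in grouped.items()}


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(u * v for u, v in zip(a, b))
    norm = math.sqrt(sum(u * u for u in a)) * math.sqrt(sum(v * v for v in b))
    return dot / norm if norm > 0 else 0.0


def classify(query, prototypes):
    """Cosine nearest-neighbor prediction: the class of the closest prototype."""
    return max(prototypes, key=lambda y: cosine(query, prototypes[y]))


def make_toy_flow(centers):
    """Hypothetical stand-in for the learned gradient flow network.

    Returns dp/dt = centers - p, whose exact solution is
    p(t) = centers + (p(0) - centers) * exp(-t).
    """
    def flow(prototypes, t):
        return {y: [c - p for p, c in zip(prototypes[y], centers[y])]
                for y in prototypes}
    return flow


def solve_euler(prototypes, flow, t_end=1.0, steps=1000):
    """Integrate dp/dt = flow(p, t) from t=0 to t_end with fixed Euler steps."""
    dt = t_end / steps
    t = 0.0
    for _ in range(steps):
        grads = flow(prototypes, t)
        prototypes = {y: [p + dt * g for p, g in zip(prototypes[y], grads[y])]
                      for y in prototypes}
        t += dt
    return prototypes


# Toy 2-way 2-shot episode with 2-D features (made-up numbers).
support = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
support_labels = [0, 0, 1, 1]

initial = mean_prototypes(support, support_labels)
# Assumed reference centers, used only to drive the placeholder flow.
flow = make_toy_flow({0: [1.0, 0.0], 1: [0.0, 1.0]})
refined = solve_euler(initial, flow)
prediction = classify([0.7, 0.3], refined)
```

In this sketch the solver only needs a callable `flow(prototypes, t)`, so a trained network could be swapped in for the placeholder without changing the integration loop. Under the paper's framing, that callable is the part that is meta-learned.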
