Enabling the Network to Surf the Internet
This work addresses the problem of data scarcity in few-shot learning for AI researchers, offering an incremental improvement by automating data collection and enhancing generalization.
The paper tackles the challenge of few-shot learning by developing a framework that enables models to autonomously collect and annotate data from the Internet, eliminating the need for manual effort. It also introduces a normalization strategy that boosts accuracy by up to 20.46%, achieving performance comparable to supervised methods and surpassing unsupervised ones by over 10% on datasets like miniImageNet.
Few-shot learning is challenging because both data and labels are limited. Existing algorithms usually address this problem by pre-training the model on a considerable amount of annotated data that shares knowledge with the target domain. However, large quantities of homogeneous data samples are not always available. To tackle this issue, we develop a framework that enables the model to surf the Internet, meaning that the model can collect and annotate data without manual effort. Since online data is virtually limitless and continues to be generated, the model can constantly obtain up-to-date knowledge from the Internet. Additionally, we observe that the generalization ability of the learned representation is crucial for self-supervised learning. To demonstrate its importance, we propose a simple yet efficient normalization strategy. This strategy boosts the accuracy of the model significantly (by up to 20.46%). We demonstrate the superiority of the proposed framework with experiments on miniImageNet, tieredImageNet and Omniglot. The results indicate that our method surpasses previous unsupervised counterparts by a large margin (more than 10%) and achieves performance comparable to that of the supervised ones.
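To make concrete how normalizing a learned representation can change few-shot predictions, the following is a minimal sketch and not the method of this paper. The abstract does not specify the normalization strategy, so the sketch assumes L2 normalization of embeddings followed by a nearest-centroid classifier. The function names (`l2_normalize`, `class_centroids`, `classify`) and the toy 2-D embeddings are hypothetical.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def class_centroids(support, normalize=True):
    """Average the (optionally normalized) support embeddings of each class."""
    grouped = defaultdict(list)
    for label, vec in support:
        grouped[label].append(l2_normalize(vec) if normalize else list(vec))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(query, centroids, normalize=True):
    """Return the label whose centroid is nearest (Euclidean) to the query embedding."""
    q = l2_normalize(query) if normalize else list(query)
    return min(centroids, key=lambda label: math.dist(q, centroids[label]))


# Toy 2-way 2-shot episode with made-up 2-D embeddings (assumption, not real data).
support = [
    ("cat", [10.0, 1.0]), ("cat", [8.0, 0.0]),
    ("dog", [0.0, 1.0]), ("dog", [0.1, 0.9]),
]
# The query points in the "cat" direction but has a small magnitude.
query = [0.5, 0.1]

raw = classify(query, class_centroids(support, normalize=False), normalize=False)
normed = classify(query, class_centroids(support, normalize=True), normalize=True)
print(raw)     # dog  (magnitude dominates the distance)
print(normed)  # cat  (only direction matters after normalization)
```

In this toy episode, the unnormalized classifier is misled by the difference in embedding magnitude, while the normalized one compares directions only. Whether the paper's strategy behaves this way is not stated in the abstract.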