CL Oct 4, 2021

Revisiting Self-Training for Few-Shot Learning of Language Model

Yiming Chen, Yan Zhang, Chen Zhang, Grandee Lee, Ran Cheng, Haizhou Li

arXiv:2110.01256v130.9663 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of few-shot learning for language models. The contribution is incremental, but the proposed method is robust and efficient and improves performance across various settings.

The authors tackled the problem of effectively using unlabeled data for few-shot learning in language models by revisiting self-training, resulting in a state-of-the-art method that outperforms other approaches on multiple benchmarking tasks.

As unlabeled data carry rich task-relevant information, they have proven useful for few-shot learning of language models. The question is how to make effective use of such data. In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM. Given two views of a text sample produced by weak and strong augmentation techniques, SFLM generates a pseudo label on the weakly augmented version. The model is then fine-tuned to predict the same pseudo label on the strongly augmented version. This simple approach is shown to outperform other state-of-the-art supervised and semi-supervised counterparts on six sentence classification and six sentence-pair classification benchmark tasks. In addition, SFLM relies on only a small amount of in-domain unlabeled data. We conduct a comprehensive analysis to demonstrate the robustness of our proposed approach under various settings, including augmentation techniques, model scale, and few-shot knowledge transfer across tasks.
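The weak/strong consistency idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The token-masking augmentation, the helper names (`mask_tokens`, `pseudo_label`, `self_training_loss`), and the confidence threshold of 0.9 are all assumptions introduced here for illustration, and the class probabilities are supplied by hand rather than produced by a language model.

```python
import math
import random


def mask_tokens(tokens, rate, rng, mask="[MASK]"):
    """Hypothetical augmentation: mask each token independently with probability `rate`.

    A low rate stands in for a "weak" view and a high rate for a "strong" view.
    """
    return [mask if rng.random() < rate else tok for tok in tokens]


def pseudo_label(weak_probs):
    """Return (label, confidence): the argmax class on the weak view and its probability."""
    label = max(range(len(weak_probs)), key=lambda i: weak_probs[i])
    return label, weak_probs[label]


def self_training_loss(weak_batch, strong_batch, threshold=0.9):
    """Mean cross-entropy of strong-view predictions against weak-view pseudo labels.

    Samples whose weak-view confidence is below `threshold` contribute zero loss.
    The threshold is an assumption of this sketch, not taken from the abstract.
    """
    if not weak_batch:
        return 0.0
    total = 0.0
    for weak_probs, strong_probs in zip(weak_batch, strong_batch):
        label, confidence = pseudo_label(weak_probs)
        if confidence >= threshold:
            # Clamp to avoid log(0) when the strong view assigns zero probability.
            total += -math.log(max(strong_probs[label], 1e-12))
    return total / len(weak_batch)


# Two views of the same unlabeled sentence (toy augmentation).
rng = random.Random(0)
tokens = "a truly great movie".split()
weak_view = mask_tokens(tokens, 0.1, rng)
strong_view = mask_tokens(tokens, 0.5, rng)

# Hand-written class probabilities standing in for model outputs on each view.
weak_batch = [[0.95, 0.05], [0.60, 0.40]]
strong_batch = [[0.70, 0.30], [0.50, 0.50]]
loss = self_training_loss(weak_batch, strong_batch)
```

In this toy batch only the first sample passes the confidence threshold, so only it contributes to the loss. In a real setup the probabilities would come from the model's predictions on each view, and the loss would be backpropagated through the strong-view prediction only.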
