CL, AI | Jul 15, 2021

FewCLUE: A Chinese Few-shot Learning Evaluation Benchmark

Liang Xu, Xiaojing Lu, Chenyang Yuan, Xuanwei Zhang, Huilin Xu, Hu Yuan, Guoao Wei, Xiang Pan, Xin Tian, Libo Qin, Hu Hai

arXiv:2107.07498v26.159 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

FewCLUE gives researchers and practitioners in Chinese NLP a standardized benchmark for evaluating and comparing few-shot learning methods, which facilitates cumulative progress in the field.

The authors address the lack of a comprehensive evaluation benchmark for few-shot learning in Chinese by introducing FewCLUE, which includes nine tasks. They systematically compare five state-of-the-art few-shot methods against fine-tuning and zero-shot learning, and find that PET and P-tuning achieve the best overall performance with RoBERTa and ERNIE respectively.

Pre-trained language models (PLMs) have achieved tremendous success in natural language understanding tasks. While different learning schemes (fine-tuning, zero-shot, and few-shot learning) have been widely explored and compared for languages such as English, there is comparatively little work that fairly and comprehensively evaluates and compares these methods in Chinese, which hinders cumulative progress. In this paper, we introduce the Chinese Few-shot Learning Evaluation Benchmark (FewCLUE), the first comprehensive few-shot evaluation benchmark in Chinese. It includes nine tasks, ranging from single-sentence and sentence-pair classification tasks to machine reading comprehension tasks. We systematically evaluate five state-of-the-art (SOTA) few-shot learning methods (PET, ADAPET, LM-BFF, P-tuning, and EFL), and compare their performance with fine-tuning and zero-shot learning schemes on the newly constructed FewCLUE benchmark. Experimental results reveal that: 1) the effectiveness of different few-shot learning methods is sensitive to the pre-trained model to which they are applied; 2) PET and P-tuning achieve the best overall performance with RoBERTa and ERNIE respectively. Our benchmark is used in the few-shot learning contest of NLPCC 2021. In addition, we provide a user-friendly toolkit and an online leaderboard to help facilitate further progress on Chinese few-shot learning. We also report baseline performance for the different learning methods as a reference for future research.
