CL · LG · Jul 13, 2025

Your Pretrained Model Tells the Difficulty Itself: A Self-Adaptive Curriculum Learning Paradigm for Natural Language Understanding

arXiv:2507.09758v1 · 9.63 citations · h-index: 13 · ACL

Originality: Incremental advance

AI Analysis

This work addresses a problem NLP practitioners face in curriculum learning: difficulty metrics that do not accurately reflect how hard an example is. The advance is incremental, since it builds on existing curriculum learning methods.

The paper tackles curriculum learning in NLP by introducing a self-adaptive paradigm in which pre-trained models predict difficulty scores for fine-tuning examples. This leads to faster convergence and improved performance on NLU datasets compared with random sampling.

Curriculum learning is a widely adopted training strategy in natural language processing (NLP), where models are exposed to examples organized by increasing difficulty to enhance learning efficiency and performance. However, most existing approaches rely on manually defined difficulty metrics, such as text length, which may not accurately reflect the model's own perspective. To overcome this limitation, we present a self-adaptive curriculum learning paradigm that prioritizes fine-tuning examples based on difficulty scores predicted by pre-trained language models (PLMs) themselves. Building on these scores, we explore training strategies that differ in how they order examples for fine-tuning: easy-to-hard, hard-to-easy, and mixed sampling. We evaluate our method on four natural language understanding (NLU) datasets covering both binary and multi-class classification tasks. Experimental results show that our approach leads to faster convergence and improved performance compared to standard random sampling.
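The ordering strategies named in the abstract can be illustrated with a minimal Python sketch. It assumes each fine-tuning example already has a scalar difficulty score (higher means harder). How the PLM produces that score is not specified here, so the scores are plain input. The function name `curriculum_order` and the alternating easy/hard scheme used for "mixed" are hypothetical choices for illustration, not the paper's definitions.

```python
import random


def curriculum_order(scores, strategy="easy_to_hard", seed=0):
    """Return example indices ordered by a curriculum strategy.

    scores: one difficulty score per example (assumption: higher = harder).
    strategy: "easy_to_hard", "hard_to_easy", "mixed", or "random".
    """
    # Indices sorted from lowest to highest difficulty (stable for ties).
    ascending = sorted(range(len(scores)), key=lambda i: scores[i])

    if strategy == "easy_to_hard":
        return ascending
    if strategy == "hard_to_easy":
        return ascending[::-1]
    if strategy == "mixed":
        # Hypothetical mixing scheme: alternate between the easiest and
        # the hardest remaining example.
        order = []
        lo, hi = 0, len(ascending) - 1
        while lo <= hi:
            order.append(ascending[lo])
            if lo != hi:
                order.append(ascending[hi])
            lo += 1
            hi -= 1
        return order
    if strategy == "random":
        # Baseline: standard random sampling order.
        order = list(range(len(scores)))
        random.Random(seed).shuffle(order)
        return order
    raise ValueError(f"unknown strategy: {strategy}")


# Toy difficulty scores for four examples (illustrative values only).
toy_scores = [0.9, 0.1, 0.5, 0.3]
easy_first = curriculum_order(toy_scores, "easy_to_hard")
hard_first = curriculum_order(toy_scores, "hard_to_easy")
mixed = curriculum_order(toy_scores, "mixed")
```

The returned index list could then drive the order in which a fine-tuning loop visits examples, with the "random" strategy serving as the baseline the abstract compares against.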
