HuCurl: Human-induced Curriculum Discovery
This work gives NLP practitioners a method for discovering better training curricula, though it appears incremental because it builds on existing curriculum learning concepts.
The paper tackles the problem of curriculum discovery by introducing a framework that identifies effective curricula based on sample difficulty measures like annotation entropy and loss, showing that top-performing curricula are often non-monotonic and outperform existing approaches across several NLP tasks.
We introduce the problem of curriculum discovery and describe a curriculum learning framework capable of discovering effective curricula in a curriculum space based on prior knowledge about sample difficulty. Using annotation entropy and loss as measures of difficulty, we show that (i) the top-performing discovered curricula for a given model and dataset are often non-monotonic, in contrast to the monotonic curricula in the existing literature, (ii) the prevailing easy-to-hard or hard-to-easy transition curricula are often at risk of underperforming, and (iii) the curricula discovered for smaller datasets and models perform well on larger datasets and models, respectively. The proposed framework encompasses some of the existing curriculum learning approaches and can discover curricula that outperform them across several NLP tasks.
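To make the difficulty measure concrete, the following is a minimal sketch of how annotation entropy could be computed from per-sample annotator labels and used to bucket samples into difficulty groups. The function names, the three-group split, and the sigmoid weighting with `rate` and `shift` parameters are illustrative assumptions, not details taken from the abstract.

```python
import math
from collections import Counter


def annotation_entropy(labels):
    """Shannon entropy (in nats) of the annotator label distribution for one sample.

    Full agreement gives 0; more disagreement gives higher entropy,
    which is read here as higher difficulty.
    """
    counts = Counter(labels)
    total = len(labels)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def difficulty_groups(entropies, n_groups=3):
    """Assign each sample to a group by entropy rank (0 = easiest).

    Assumption: equal-sized rank-based groups; the abstract does not say how
    samples are partitioned.
    """
    order = sorted(range(len(entropies)), key=lambda i: entropies[i])
    groups = [0] * len(entropies)
    for rank, i in enumerate(order):
        groups[i] = min(rank * n_groups // len(entropies), n_groups - 1)
    return groups


def sigmoid_weight(t, rate, shift):
    """Hypothetical weight of a difficulty group at training progress t in [0, 1].

    A positive rate increases the weight over training and a negative rate
    decreases it. Giving each group its own (rate, shift) pair allows
    schedules that are neither purely easy-to-hard nor hard-to-easy.
    """
    return 1.0 / (1.0 + math.exp(-rate * (t - shift)))


# Three samples, each labeled by several annotators.
samples = [
    ["entail", "entail", "entail"],        # full agreement
    ["entail", "neutral"],                 # maximal two-way disagreement
    ["entail", "entail", "entail", "neutral"],
]
entropies = [annotation_entropy(s) for s in samples]
groups = difficulty_groups(entropies)
```

In this sketch a curriculum is one `(rate, shift)` pair per difficulty group, so searching the curriculum space amounts to searching over those pairs. Loss could replace entropy as the difficulty score without changing the rest of the code.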