Progressive Mastery: Customized Curriculum Learning with Guided Prompting for Mathematical Reasoning
This paper addresses sample-efficiency problems in LLM post-training for mathematical reasoning, though it appears incremental because it builds on existing curriculum learning methods.
The paper tackles inefficient sample utilization and inflexible handling of sample difficulty in LLM post-training by proposing Customized Curriculum Learning (CCL), which customizes curriculum datasets based on each model's capabilities and uses guided prompting to reduce sample difficulty. Experiments show CCL significantly outperforms uniform training approaches across five mathematical reasoning benchmarks.
Large Language Models (LLMs) have achieved remarkable performance across various reasoning tasks, yet post-training is constrained by inefficient sample utilization and inflexible processing of difficult samples. To address these limitations, we propose Customized Curriculum Learning (CCL), a novel framework with two key innovations. First, we introduce a model-adaptive difficulty definition that customizes curriculum datasets based on each model's individual capabilities rather than using predefined difficulty metrics. Second, we develop "Guided Prompting," which dynamically reduces sample difficulty through strategic hints, enabling effective utilization of challenging samples that would otherwise degrade performance. Comprehensive experiments on supervised fine-tuning and reinforcement learning demonstrate that CCL significantly outperforms uniform training approaches across five mathematical reasoning benchmarks, confirming its effectiveness across both paradigms in enhancing sample utilization and model performance.
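The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how difficulty is measured or how hints are built. The sketch assumes difficulty is one minus the model's empirical accuracy over a few sampled attempts, and that a hint is a leading fraction of a reference solution. All names (`estimate_difficulty`, `build_curriculum`, `guided_prompt`, `sample_answer`, `hint_ratio`) are hypothetical.

```python
from typing import Callable, Dict, List


def estimate_difficulty(problem: str, answer: str,
                        sample_answer: Callable[[str], str],
                        n_samples: int = 8) -> float:
    """Assumed metric: 1 - fraction of sampled attempts that match the answer."""
    correct = sum(1 for _ in range(n_samples) if sample_answer(problem) == answer)
    return 1.0 - correct / n_samples


def build_curriculum(dataset: List[Dict], sample_answer: Callable[[str], str],
                     n_samples: int = 8) -> List[Dict]:
    """Score every item with the given model and order it from easy to hard."""
    scored = [
        dict(item, difficulty=estimate_difficulty(
            item["problem"], item["answer"], sample_answer, n_samples))
        for item in dataset
    ]
    return sorted(scored, key=lambda item: item["difficulty"])


def guided_prompt(problem: str, solution_steps: List[str], hint_ratio: float) -> str:
    """Assumed hint scheme: prepend the first `hint_ratio` share of reference steps."""
    k = int(len(solution_steps) * hint_ratio)
    if k == 0:
        return problem  # no hint: the sample keeps its original difficulty
    hint = "\n".join(solution_steps[:k])
    return f"{problem}\nHint (partial solution):\n{hint}"


# Toy stand-in for a model: it only ever solves "2+2".
def toy_model(problem: str) -> str:
    return "4" if problem == "2+2" else "?"


toy_data = [
    {"problem": "hard integral", "answer": "7"},
    {"problem": "2+2", "answer": "4"},
]
curriculum = build_curriculum(toy_data, toy_model, n_samples=4)
```

Because the difficulty score depends on the model passed in as `sample_answer`, two different models would produce two different orderings of the same dataset. That is the sense in which this sketch is "model-adaptive".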