LG | AI | Aug 22, 2025

WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning

arXiv:2508.16741v1 | h-index: 1
Originality: Highly original
AI Analysis

This paper addresses efficient and safe prompt refinement for large language models, particularly in settings where models are closed-source or hard to fine-tune, and offers a scalable solution.

The paper tackles the challenge of effective prompt engineering by introducing Weak-to-Strong Transfer (WST), an automatic framework in which a small teacher model generates instructions that enhance a larger student model, with reported gains of 98% on MATH-500 and 134% on HH-RLHF.

Effective prompt engineering remains a challenging task for many applications. We introduce Weak-to-Strong Transfer (WST), an automatic prompt engineering framework where a small "Teacher" model generates instructions that enhance the performance of a much larger "Student" model. Unlike prior work, WST requires only a weak teacher, making it efficient and broadly applicable in settings where large models are closed-source or difficult to fine-tune. Using reinforcement learning, the Teacher Model's instructions are iteratively improved based on the Student Model's outcomes, yielding substantial gains across reasoning (MATH-500, GSM8K) and alignment (HH-RLHF) benchmarks - 98% on MATH-500 and 134% on HH-RLHF - and surpassing baselines such as GPT-4o-mini and Llama-70B. These results demonstrate that small models can reliably scaffold larger ones, unlocking latent capabilities while avoiding misleading prompts that stronger teachers may introduce, establishing WST as a scalable solution for efficient and safe LLM prompt refinement.
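The feedback loop described in the abstract (the Teacher proposes an instruction, the Student's outcome becomes a reward, and the Teacher is updated by reinforcement learning) can be illustrated with a toy sketch. This is not the authors' implementation: the candidate instructions, the stub Student with made-up success rates, and the plain REINFORCE update over a softmax policy are all assumptions chosen only to make the loop concrete. In the paper the Teacher is a small language model that generates instructions rather than picking from a fixed list.

```python
import math
import random

# Hypothetical candidate instructions the toy "Teacher" can choose from.
INSTRUCTIONS = [
    "Answer directly.",
    "Think step by step, then give the final answer.",
    "Restate the problem, solve it, and verify the result.",
]

# Assumed success rates of a stub "Student" under each instruction.
# These numbers are invented for illustration, not taken from the paper.
STUDENT_SUCCESS_RATE = [0.3, 0.6, 0.8]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def student_reward(action, rng):
    """Stub Student: 1.0 if the task counts as solved under this instruction, else 0.0."""
    return 1.0 if rng.random() < STUDENT_SUCCESS_RATE[action] else 0.0


def train_teacher(steps=5000, lr=0.1, seed=0):
    """Train a softmax policy over instructions with REINFORCE and a running baseline."""
    rng = random.Random(seed)
    logits = [0.0] * len(INSTRUCTIONS)
    baseline = 0.0  # running average reward, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        # Teacher samples an instruction from its current policy.
        action = rng.choices(range(len(INSTRUCTIONS)), weights=probs)[0]
        # Student outcome is the only training signal the Teacher sees.
        reward = student_reward(action, rng)
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. logit i is 1[i == action] - probs[i].
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        baseline += 0.01 * (reward - baseline)
    return softmax(logits)


probs = train_teacher()
best = max(range(len(probs)), key=probs.__getitem__)
print("Teacher policy:", [round(p, 3) for p in probs])
print("Preferred instruction:", INSTRUCTIONS[best])
```

In this toy setting the Teacher never sees the Student's internals, only a scalar outcome per attempt, which loosely mirrors the closed-source setting the abstract describes. Over training, the policy should shift probability toward instructions with higher assumed success rates.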

