LG · Jun 3, 2025

On Weak-to-Strong Generalization and f-Divergence

arXiv:2506.03109v1 · 2 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses computational efficiency in training strong AI models, but it is incremental as it builds on existing weak-to-strong generalization methods.

The paper tackles the problem of weak-to-strong generalization by introducing f-divergence as a loss function framework, showing it improves generalization and noise tolerance of strong models in practice.

Weak-to-strong generalization (W2SG) has emerged as a promising paradigm for stimulating the capabilities of strong pre-trained models by leveraging supervision from weaker supervisors. To improve the performance of the strong model, existing methods often require additional weak models or complex procedures, leading to substantial computational and memory overhead. Motivated by the effectiveness of $f$-divergence loss in various machine learning domains, we introduce $f$-divergence as an information-theoretic loss function framework in W2SG. Our theoretical analysis reveals fundamental limitations and equivalence of different $f$-divergence losses in W2SG, supported by sample complexity bounds and information-theoretic insights. We empirically demonstrate that $f$-divergence loss, which generalizes widely-used metrics like KL divergence, effectively improves generalization and noise tolerance of the strong model in practice.
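To make the abstract's claim that $f$-divergence generalizes KL divergence concrete, here is a minimal sketch using the standard textbook definition for discrete distributions, $D_f(P \| Q) = \sum_i q_i \, f(p_i / q_i)$ with $f$ convex and $f(1) = 0$. It is not the paper's training loss. The function names, the example probabilities, and the choice of which distribution plays $P$ and which plays $Q$ are all assumptions made for illustration.

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i) for discrete distributions.

    Assumes every q_i is strictly positive (hypothetical helper, not from the paper).
    """
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

# Generator functions f(t); each is convex with f(1) = 0.
def kl(t):
    # f(t) = t log t recovers KL(P || Q); the t = 0 case uses the limit value 0.
    return t * math.log(t) if t > 0 else 0.0

def reverse_kl(t):
    # f(t) = -log t recovers KL(Q || P); requires t > 0.
    return -math.log(t)

def total_variation(t):
    # f(t) = |t - 1| / 2 recovers the total variation distance.
    return 0.5 * abs(t - 1.0)

def squared_hellinger(t):
    # f(t) = (sqrt(t) - 1)^2; some conventions add a factor of 1/2.
    return (math.sqrt(t) - 1.0) ** 2

# Illustrative label distributions over three classes (made-up numbers).
weak = [0.7, 0.2, 0.1]    # stands in for a weak supervisor's soft label
strong = [0.5, 0.3, 0.2]  # stands in for a strong model's prediction

for name, f in [("KL", kl), ("reverse KL", reverse_kl),
                ("total variation", total_variation),
                ("squared Hellinger", squared_hellinger)]:
    print(f"{name}: {f_divergence(weak, strong, f):.4f}")
```

Swapping the generator `f` changes the divergence without touching the rest of the computation, which is what lets a single loss framework cover KL divergence and its relatives.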

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
