ML · LG · ST · Nov 11, 2025

Source-Optimal Training is Transfer-Suboptimal

arXiv:2511.08401v1
Originality: Highly original
AI Analysis

The paper identifies a fundamental misalignment in transfer learning and shows that the regularization best suited to transfer is counterintuitive. Practitioners who account for it can improve model performance when transferring between tasks.

The paper proves that the source regularization minimizing source risk does not align with the regularization maximizing transfer benefit. Using L2-SP ridge regression, it shows that transfer-optimal penalties diverge from task-optimal ones: stronger regularization is required in high-SNR regimes and weaker regularization in low-SNR regimes. Experiments on CIFAR-10 and MNIST confirm this pattern in non-linear networks.

We prove a fundamental misalignment in transfer learning: the source regularization that minimizes source risk almost never coincides with the regularization maximizing transfer benefit. Through sharp phase boundaries for L2-SP ridge regression, we characterize the transfer-optimal source penalty $\tau_0^*$ and show it diverges predictably from task-optimal values, requiring stronger regularization in high-SNR regimes and weaker regularization in low-SNR regimes. Additionally, in isotropic settings the decision to transfer is remarkably independent of target sample size and noise, depending only on task alignment and source characteristics. CIFAR-10 and MNIST experiments confirm this counterintuitive pattern persists in non-linear networks.
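To make the setup in the abstract concrete, here is a minimal NumPy sketch of the two-stage pipeline it describes: a ridge fit on the source task with penalty `tau0`, followed by an L2-SP fit on the target task that shrinks toward the source solution instead of toward zero. This is an illustration under stated assumptions, not the paper's code. The function names, the synthetic data model, and every parameter (`rho` for task alignment, `lam` for the target penalty, sample sizes, noise level) are hypothetical choices made for the example.

```python
import numpy as np


def ridge(X, y, penalty):
    """Standard ridge regression: argmin_w ||y - Xw||^2 + penalty * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(d), X.T @ y)


def l2sp_ridge(X, y, w_source, penalty):
    """L2-SP ridge: argmin_w ||y - Xw||^2 + penalty * ||w - w_source||^2.

    Equivalent to a ridge fit on the residual y - X w_source, added back
    onto w_source.
    """
    d = X.shape[1]
    residual = y - X @ w_source
    return w_source + np.linalg.solve(X.T @ X + penalty * np.eye(d), X.T @ residual)


def sweep_source_penalty(tau0_grid, d=50, n_src=200, n_tgt=40,
                         rho=0.9, noise=1.0, lam=5.0, seed=0):
    """Sweep the source penalty tau0 and record source and target error.

    All defaults are hypothetical. Inputs are isotropic Gaussian, so the
    excess risk of an estimate w_hat is ||w_hat - w_true||^2.
    """
    rng = np.random.default_rng(seed)
    w_src = rng.normal(size=d) / np.sqrt(d)
    # Target weights correlated with the source weights (alignment rho).
    w_tgt = rho * w_src + np.sqrt(1 - rho**2) * rng.normal(size=d) / np.sqrt(d)

    X_s = rng.normal(size=(n_src, d))
    y_s = X_s @ w_src + noise * rng.normal(size=n_src)
    X_t = rng.normal(size=(n_tgt, d))
    y_t = X_t @ w_tgt + noise * rng.normal(size=n_tgt)

    rows = []
    for tau0 in tau0_grid:
        w_hat_s = ridge(X_s, y_s, tau0)                # stage 1: source fit
        w_hat_t = l2sp_ridge(X_t, y_t, w_hat_s, lam)   # stage 2: transfer
        rows.append((tau0,
                     float(np.sum((w_hat_s - w_src) ** 2)),
                     float(np.sum((w_hat_t - w_tgt) ** 2))))
    return rows


if __name__ == "__main__":
    for tau0, src_err, tgt_err in sweep_source_penalty(np.logspace(-2, 3, 6)):
        print(f"tau0={tau0:9.2f}  source_err={src_err:.4f}  target_err={tgt_err:.4f}")
```

Comparing the `tau0` that minimizes the source-error column with the one that minimizes the target-error column is one way to probe the misalignment the abstract describes. A single random draw like this one is only a toy check and is no substitute for the paper's analysis.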

