LG, DIS-NN, ML | Jul 6, 2025

Transfer Learning in Infinite Width Feature Learning Networks

arXiv:2507.04448v1 | h-index: 28
Originality: Highly original
AI Analysis

This work provides a theoretical account of transfer learning in neural networks, clarifying when features learned on a source task help on a target task.

The authors develop a theoretical framework for transfer learning in infinitely wide neural networks operating in a feature learning regime. They analyze both Bayesian and gradient flow settings to show how representations evolve and are reused through an elastic weight coupling. Applications to regression tasks and real datasets show how factors such as task alignment and dataset size affect the utility of transfer.

We develop a theory of transfer learning in infinitely wide neural networks where both the pretraining (source) and downstream (target) tasks can operate in a feature learning regime. We analyze both the Bayesian framework, where learning is described by a posterior distribution over the weights, and gradient flow training of randomly initialized networks trained with weight decay. Both settings track how representations evolve in the source and target tasks. The summary statistics of these theories are adapted feature kernels which, after transfer learning, depend on data and labels from both source and target tasks. Reuse of features during transfer learning is controlled by an elastic weight coupling, which sets how strongly the network relies on features learned during training on the source task. We apply our theory to linear and polynomial regression tasks as well as real datasets. Our theory and experiments reveal how elastic weight coupling, feature learning strength, dataset size, and source and target task alignment interact to determine the utility of transfer learning.
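To make the role of the elastic weight coupling concrete, here is a minimal sketch under strong simplifying assumptions. It uses a finite linear model rather than the infinite-width networks analyzed in the paper, and it assumes the coupling takes the form of a quadratic penalty (delta/2)·||w − w_source||² added to the target loss. The name `delta`, the penalty form, the helper functions, and the toy data are all illustrative choices, not taken from the paper.

```python
# Illustrative sketch only: a small linear model, NOT the paper's infinite-width theory.
# Assumption: the elastic coupling is a quadratic penalty pulling target weights
# toward the (fixed) source weights, with strength `delta`.
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_target(xs, ys, w_source, delta, steps=2000):
    """Gradient descent on
        (1/(2n)) * sum_i (w.x_i - y_i)^2 + (delta/2) * ||w - w_source||^2,
    starting from the source weights."""
    n = len(xs)
    # Step size chosen so that lr * (largest curvature) <= 1, which keeps
    # the iteration stable for every value of delta.
    mean_sq_norm = sum(dot(x, x) for x in xs) / n
    lr = 1.0 / (delta + mean_sq_norm)
    w = list(w_source)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = dot(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        # Elastic term: gradient of (delta/2) * ||w - w_source||^2
        for j in range(len(w)):
            grad[j] += delta * (w[j] - w_source[j])
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


random.seed(0)
w_source = [1.0, 0.0, 0.0]        # hypothetical weights learned on the source task
w_target_true = [0.8, 0.6, 0.0]   # hypothetical target task, partially aligned with the source
xs = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(5)]
ys = [dot(w_target_true, x) for x in xs]

# Squared distance between the fine-tuned weights and the source weights,
# for weak, moderate, and strong coupling.
dists = {}
for delta in (0.0, 1.0, 100.0):
    w = train_target(xs, ys, w_source, delta)
    dists[delta] = sq_dist(w, w_source)
    print(f"delta={delta:6.1f}  ||w - w_source||^2 = {dists[delta]:.5f}")
```

In this toy setting, a larger `delta` keeps the fine-tuned weights closer to the source solution, while `delta = 0` lets the target data determine the weights alone. Whether stronger coupling helps or hurts on the target task depends on how well the two tasks align and how much target data is available, which is the interplay the abstract describes.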

