LG ML · Aug 26, 2020

What is being transferred in transfer learning?

Behnam Neyshabur, Hanie Sedghi, Chiyuan Zhang

arXiv:2008.11687v239.7619 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a fundamental question in machine learning, relevant to both researchers and practitioners, by clarifying the mechanisms behind transfer learning, though its contribution is incremental.

The paper investigates what enables successful transfer learning in deep networks. It shows that part of the benefit comes from learning low-level statistics of the data rather than from feature reuse alone, and that models trained from pre-trained weights stay in the same basin of the loss landscape, remaining similar in feature space and close in parameter space.

One desired capability for machines is the ability to transfer their knowledge of one domain to another where data is (usually) scarce. Despite the wide adoption of transfer learning in various deep learning applications, we do not yet understand what enables a successful transfer and which parts of the network are responsible for it. In this paper, we provide new tools and analyses to address these fundamental questions. Through a series of analyses on transferring to block-shuffled images, we separate the effect of feature reuse from that of learning low-level statistics of the data, and show that some of the benefit of transfer learning comes from the latter. We show that when training from pre-trained weights, the model stays in the same basin of the loss landscape, and that different instances of such a model are similar in feature space and close in parameter space.
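To make the block-shuffling idea concrete, here is a minimal sketch. The abstract does not spell out the exact procedure, so this assumes the image is cut into equal square blocks that are then permuted uniformly at random; the function name `block_shuffle` and its parameters are hypothetical and not taken from the authors' code. Shuffling destroys the global arrangement of visual features while leaving pixel-level statistics untouched, which is the property the analysis relies on.

```python
import numpy as np


def block_shuffle(image, block_size, rng):
    """Permute the square blocks of an H x W x C image (illustrative sketch).

    Assumption: H and W are both divisible by block_size.
    """
    h, w, c = image.shape
    if h % block_size or w % block_size:
        raise ValueError("image height and width must be divisible by block_size")
    gh, gw = h // block_size, w // block_size

    # Cut the image into a gh x gw grid of tiles, then flatten the grid.
    tiles = image.reshape(gh, block_size, gw, block_size, c).swapaxes(1, 2)
    tiles = tiles.reshape(gh * gw, block_size, block_size, c)

    # Reorder the tiles with a uniformly random permutation.
    tiles = tiles[rng.permutation(gh * gw)]

    # Reassemble the permuted tiles into an image of the original shape.
    out = tiles.reshape(gh, gw, block_size, block_size, c).swapaxes(1, 2)
    return out.reshape(h, w, c)


rng = np.random.default_rng(0)
image = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
shuffled = block_shuffle(image, block_size=4, rng=rng)
```

Smaller values of `block_size` remove more of the spatial structure. In this sketch, setting `block_size` to the full image side leaves the image unchanged, since there is only one block to permute.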
