LG · Feb 5, 2018

Explicit Inductive Bias for Transfer Learning with Convolutional Networks

arXiv:1802.01483v2 · 426 citations
Originality: Incremental advance
AI Analysis

This work addresses feature retention when fine-tuning convolutional networks for transfer learning. It is useful to practitioners, but the advance is incremental.

The paper tackles the problem of retaining useful features from pre-trained models during fine-tuning in transfer learning by proposing explicit regularization schemes, and finds that a simple L2 penalty with the pre-trained model as reference improves performance.

In inductive transfer learning, fine-tuning pre-trained convolutional networks substantially outperforms training from scratch. When using fine-tuning, the underlying assumption is that the pre-trained model extracts generic features, which are at least partially relevant for solving the target task, but would be difficult to extract from the limited amount of data available on the target task. However, besides the initialization with the pre-trained model and the early stopping, there is no mechanism in fine-tuning for retaining the features learned on the source task. In this paper, we investigate several regularization schemes that explicitly promote the similarity of the final solution with the initial model. We show the benefit of having an explicit inductive bias towards the initial model, and we eventually recommend a simple $L^2$ penalty with the pre-trained model being a reference as the baseline of penalty for transfer learning tasks.
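To make the recommended penalty concrete, here is a minimal sketch of an $L^2$ penalty that uses the pre-trained weights as its reference point, so it pulls the fine-tuned weights toward the initial model rather than toward zero. It is an illustration under stated assumptions, not the authors' implementation: the function names `l2_sp_penalty` and `l2_sp_gradient`, the strength parameter `alpha`, and the representation of parameters as dictionaries of flat float lists are all hypothetical. Parameters with no pre-trained counterpart (for example a new classification layer) are simply skipped here, which is also an assumption of this sketch.

```python
def l2_sp_penalty(weights, pretrained, alpha=0.01):
    """Return (alpha / 2) * ||w - w0||^2 over parameters shared with the pre-trained model.

    weights, pretrained: dicts mapping a parameter name to a flat list of floats
    (a hypothetical representation chosen for illustration).
    """
    total = 0.0
    for name, w0 in pretrained.items():
        w = weights[name]
        # Squared distance to the pre-trained value, not to zero.
        total += sum((wi - w0i) ** 2 for wi, w0i in zip(w, w0))
    return 0.5 * alpha * total


def l2_sp_gradient(weights, pretrained, alpha=0.01):
    """Gradient of the penalty above: alpha * (w - w0) for each shared parameter."""
    return {
        name: [alpha * (wi - w0i) for wi, w0i in zip(weights[name], w0)]
        for name, w0 in pretrained.items()
    }


# Usage: the penalty is zero at initialization and grows as weights drift away.
pretrained = {"conv1": [0.5, -0.25]}
finetuned = {"conv1": [1.5, -0.25], "new_head": [0.1]}  # "new_head" has no reference
start_penalty = l2_sp_penalty(pretrained, pretrained, alpha=2.0)
drift_penalty = l2_sp_penalty(finetuned, pretrained, alpha=2.0)
drift_gradient = l2_sp_gradient(finetuned, pretrained, alpha=2.0)
```

In training, a term like this would be added to the task loss. Standard weight decay is the special case where the reference is the zero vector, which gives no incentive to stay near the pre-trained features.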

Code Implementations: 3 repos
