LG, Nov 18, 2015

Net2Net: Accelerating Learning via Knowledge Transfer

arXiv:1511.05641v4, 734 citations
Originality: Incremental advance
AI Analysis

Net2Net accelerates the experimentation process for machine learning practitioners by reducing training time, though the advance is incremental because it builds on existing pre-training methods.

The paper tackles the problem of wasteful training from scratch during neural network experimentation. It introduces Net2Net, a technique for instantaneously transferring knowledge from one neural net to a larger one, and reports a new state-of-the-art accuracy on ImageNet.

We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. During real-world workflows, one often trains very many different neural networks during the experimentation and design process. This is a wasteful process in which each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training that altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state of the art accuracy rating on the ImageNet dataset.
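To make the idea of a function-preserving transformation concrete, here is a minimal NumPy sketch for a small fully connected ReLU network. It assumes a widening scheme that copies randomly chosen hidden units and divides their outgoing weights by the number of copies, and a deepening scheme that inserts an identity-initialized layer. The function names (`net2wider`, `net2deeper_identity`, `forward`) are illustrative and not taken from the paper's code; treat this as a sketch under those assumptions, not as the authors' implementation.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def forward(x, W1, b1, W2, b2):
    # Two-layer network: hidden ReLU layer followed by a linear output layer.
    return relu(x @ W1 + b1) @ W2 + b2


def net2wider(W1, b1, W2, new_width, rng):
    """Widen the hidden layer to new_width units without changing the output.

    Hypothetical helper: extra units are copies of randomly chosen existing
    units, and the outgoing weights of every copy are divided by the number
    of copies so their summed contribution is unchanged.
    """
    old_width = W1.shape[1]
    if new_width < old_width:
        raise ValueError("new_width must be at least the current width")
    # The first old_width units map to themselves; the rest pick a random source.
    extra = rng.integers(0, old_width, size=new_width - old_width)
    mapping = np.concatenate([np.arange(old_width), extra])
    counts = np.bincount(mapping, minlength=old_width)
    W1_new = W1[:, mapping]                 # copy incoming weights
    b1_new = b1[mapping]                    # copy biases
    W2_new = W2[mapping, :] / counts[mapping][:, None]  # rescale outgoing weights
    return W1_new, b1_new, W2_new


def net2deeper_identity(width):
    """Weights for a new layer that leaves ReLU activations unchanged.

    Since relu(relu(h) @ I + 0) == relu(h), inserting this layer after a
    ReLU layer preserves the function the network computes.
    """
    return np.eye(width), np.zeros(width)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=3)
    W2, b2 = rng.normal(size=(3, 2)), rng.normal(size=2)
    x = rng.normal(size=(5, 4))
    W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=7, rng=rng)
    # The widened network should match the original on the same inputs.
    print(np.allclose(forward(x, W1, b1, W2, b2), forward(x, W1w, b1w, W2w, b2)))
```

Because the larger network starts out computing the same function as the smaller one, training can continue from the smaller network's accuracy instead of from a random initialization. The identity trick relies on the activation satisfying f(f(x)) = f(x), which holds for ReLU but not for every activation.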

Code Implementations: 3 repos
