LG, NE, ML
Jun 7, 2015

Knowledge Transfer Pre-training

arXiv:1506.02256v1
7 citations
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in pre-training complex neural architectures. It offers a flexible method that is an incremental extension of existing techniques.

The paper tackles the problem of pre-training models that lack a clear multi-layer structure, such as recurrent neural networks (RNNs). It introduces a knowledge transfer approach that trains the model on soft targets produced by a teacher model. On a speech recognition task, this allows complex RNNs to be trained effectively with a weaker DNN as the teacher, and it delivers additional gains when combined with conventional layer-wise pre-training.

Pre-training is crucial for learning deep neural networks. Most existing pre-training methods train simple models (e.g., restricted Boltzmann machines) and then stack them layer by layer to form the deep structure. This layer-wise pre-training has a strong theoretical foundation and broad empirical support. However, it is not easy to apply such methods to models without a clear multi-layer structure, e.g., recurrent neural networks (RNNs). This paper presents a new pre-training approach based on knowledge transfer learning. In contrast to the layer-wise approach, which trains model components incrementally, the new approach trains the entire model as a whole but with an easier objective function. This is achieved by using soft targets produced by a previously trained model (the teacher model). Unlike conventional layer-wise methods, the new method does not depend on the model structure, so it can be used to pre-train very complex models. Experiments on a speech recognition task demonstrated that, with this approach, complex RNNs can be trained well using a weaker deep neural network (DNN) as the teacher. Furthermore, the new method can be combined with conventional layer-wise pre-training to deliver additional gains.
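The soft-target objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the temperature parameter, the toy shapes, and the function names (`softmax`, `soft_target_loss`, `soft_target_grad`) are assumptions made for illustration, and the student's logits are treated as free parameters in place of a real RNN. The sketch only shows the general idea of minimizing cross-entropy against a teacher's posterior distribution instead of one-hot labels.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax; temperature > 1 flattens the distribution.
    # (The temperature is an assumption of this sketch, not taken from the text.)
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_target_loss(student_logits, teacher_probs):
    # Mean cross-entropy between teacher posteriors and student predictions.
    log_p = np.log(softmax(student_logits) + 1e-12)
    return float(-(teacher_probs * log_p).sum(axis=-1).mean())

def soft_target_grad(student_logits, teacher_probs):
    # Gradient of the mean cross-entropy with respect to the student logits.
    return (softmax(student_logits) - teacher_probs) / student_logits.shape[0]

# Toy data: 4 frames, 3 classes. The "teacher" is just random logits here.
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(4, 3))
teacher_probs = softmax(teacher_logits, temperature=2.0)

# Stand-in for the student: its output logits are optimized directly.
student_logits = np.zeros((4, 3))
loss_before = soft_target_loss(student_logits, teacher_probs)
for _ in range(200):
    student_logits -= 1.0 * soft_target_grad(student_logits, teacher_probs)
loss_after = soft_target_loss(student_logits, teacher_probs)
```

In a real setting the gradient would be back-propagated through the student network, and the model pre-trained this way would presumably then be fine-tuned on the ground-truth labels.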

