ML · LG · Aug 14, 2017

Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net

arXiv:1708.04106v3 · 128 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and accurate models in time-sensitive applications such as online advertising. It represents an incremental improvement in knowledge distillation techniques.

The paper tackles the problem of training lightweight neural networks for real-time tasks such as click-through rate prediction, which require both high accuracy and low inference time. It proposes a 'rocket launching' framework in which a cumbersome 'booster' net guides the light net during training, so that the light net reaches performance comparable to more complex models.

Models applied to real-time response tasks, such as click-through rate (CTR) prediction, require high accuracy and strict response times. Top-performing deep models of high depth and complexity are therefore not well suited to these applications, given the limits on inference time. To further improve neural network performance under these time and computational constraints, we propose an approach that uses a cumbersome net to help train the lightweight net used for prediction. We dub the whole process rocket launching: the cumbersome booster net guides the learning of the target light net throughout training. We analyze different loss functions that push the light net to behave similarly to the booster net, and adopt the best-performing loss in our experiments. We also use a technique called gradient block to further improve the performance of both the light net and the booster net. Experiments on benchmark datasets and real-life industrial advertising data show that our light model achieves performance previously attainable only with more complex models.
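The training objective described in the abstract can be illustrated with a minimal NumPy sketch. It rests on several assumptions not stated on this page: the mimic ("hint") loss is taken to be the mean squared error between the two nets' logits, which is only one of the candidate losses the abstract alludes to; "gradient block" is read as stopping the hint loss from sending gradients into the booster net; and the function name `rocket_loss_and_grads` and the weight `lam` are hypothetical. It is not the authors' implementation.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true class.
    p = softmax(logits)
    n = logits.shape[0]
    return -np.log(p[np.arange(n), labels]).mean()

def rocket_loss_and_grads(light_logits, booster_logits, labels, lam=1.0):
    """Joint loss for a light net and a booster net (illustrative only).

    Assumed form: CE(light) + CE(booster) + lam * MSE(light, booster logits).
    Returns the loss and its gradients w.r.t. each net's logits.
    """
    n, k = light_logits.shape
    onehot = np.eye(k)[labels]
    diff = light_logits - booster_logits
    hint = (diff ** 2).sum(axis=1).mean()
    loss = (cross_entropy(light_logits, labels)
            + cross_entropy(booster_logits, labels)
            + lam * hint)
    # The light net receives both its own CE gradient and the hint gradient.
    grad_light = (softmax(light_logits) - onehot) / n + lam * 2.0 * diff / n
    # Gradient block (assumed reading): the hint term is treated as constant
    # w.r.t. the booster, so the booster only sees its own CE gradient.
    grad_booster = (softmax(booster_logits) - onehot) / n
    return loss, grad_light, grad_booster

light = np.array([[2.0, 0.0], [0.0, 1.0]])
boost = np.array([[3.0, -1.0], [-1.0, 2.0]])
labels = np.array([0, 1])
loss, g_light, g_boost = rocket_loss_and_grads(light, boost, labels, lam=0.5)
```

In this sketch the booster's gradient does not depend on the light net's outputs, so the booster is trained only by the ground-truth labels while the light net is pulled toward both the labels and the booster. In an autodiff framework the same effect would usually be obtained with a stop-gradient or detach operation on the booster logits inside the hint term.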

Code Implementations: 3 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
