CL · LG · May 16, 2020

MicroNet for Efficient Language Modeling

arXiv:2005.07877v1 · 8 citations · Has Code
AI Analysis

This work addresses the need for efficient language models for deployment in resource-constrained environments, representing an incremental improvement by combining existing techniques.

The paper addresses the design of compact language models for efficient deployment, building on recent advances in language modeling and model compression. Compared to the baseline language model provided by the MicroNet Challenge, the resulting model is 90 times more parameter-efficient and 36 times more computation-efficient while achieving the required test perplexity of 35 on the Wikitext-103 dataset.
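For readers unfamiliar with the metric, perplexity is conventionally the exponential of the mean per-token negative log-likelihood. The sketch below illustrates that standard definition only; the function name and the toy log-probabilities are hypothetical and are not taken from the paper or its repository.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood), using natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical toy data: a model that assigns probability 1/35 to every token.
log_probs = [math.log(1 / 35)] * 1000
ppl = perplexity(log_probs)
print(round(ppl, 6))
```

A model that spreads its probability uniformly over 35 equally likely choices at every step has perplexity 35, which gives an intuition for the threshold quoted above.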

It is important to design compact language models for efficient deployment. We improve upon recent advances in both the language modeling domain and the model-compression domain to construct parameter- and computation-efficient language models. We use an efficient transformer-based architecture with adaptive embedding and softmax, differentiable non-parametric cache, Hebbian softmax, knowledge distillation, network pruning, and low-bit quantization. In this paper, we provide the winning solution to the NeurIPS 2019 MicroNet Challenge in the language modeling track. Compared to the baseline language model provided by the MicroNet Challenge, our model is 90 times more parameter-efficient and 36 times more computation-efficient while achieving the required test perplexity of 35 on the Wikitext-103 dataset. We hope that this work will aid future research into efficient language models, and we have released our full source code at https://github.com/mit-han-lab/neurips-micronet.
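Two of the compression steps named in the abstract, network pruning and low-bit quantization, can be illustrated with a minimal sketch. It assumes magnitude-based pruning and symmetric uniform quantization on a flat list of weights; the abstract does not say which variants the authors used, and every function name and value here is hypothetical rather than taken from the released code.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(len(weights) * sparsity)  # number of weights to zero
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_symmetric(weights, bits):
    """Map weights to signed integer codes in [-qmax, qmax] plus one float scale."""
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed grid")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0] * len(weights), 1.0
    scale = max_abs / qmax
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


# Hypothetical toy weights, pruned to 50% sparsity and then quantized to 4 bits.
weights = [0.8, -0.05, 0.3, -0.6, 0.02, 0.1, -0.9, 0.4]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantize_symmetric(pruned, bits=4)
restored = dequantize(codes, scale)
print(pruned)
print(codes)
```

Pruning reduces the number of nonzero parameters that must be stored and multiplied, while quantization reduces the bits spent on each remaining one, so the two savings compound.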

Code Implementations: 1 repo
