CL | LG | Oct 13, 2016

Compressing Neural Language Models by Sparse Word Representations

arXiv:1610.03950v1 | 29 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficiency issues for users of neural language models, though it appears incremental because it builds on existing dense-representation methods.

The paper addresses the time and memory cost of neural language models, which stems from the large number of parameters in the word embeddings and the output layer. It proposes compressing these models with sparse word representations. In the reported experiments, the parameter count grows very slowly as the vocabulary grows, and perplexity improves.

Neural networks are among the state-of-the-art techniques for language modeling. Existing neural language models typically map discrete words to distributed, dense vector representations. After information processing of the preceding context words by hidden layers, an output layer estimates the probability of the next word. Such approaches are time- and memory-intensive because of the large numbers of parameters for word embeddings and the output layer. In this paper, we propose to compress neural language models by sparse word representations. In the experiments, the number of parameters in our model increases very slowly with the growth of the vocabulary size, which is almost imperceptible. Moreover, our approach not only reduces the parameter space to a large extent, but also improves the performance in terms of the perplexity measure.
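
To make the parameter-growth claim concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes one plausible scheme: a fixed base set of common words keeps dense embeddings, and every other word is stored as a sparse code, meaning a few (index, weight) pairs over the base embeddings. The names `dense_params`, `sparse_params`, `compose`, `base_size`, and `nnz_per_word` are hypothetical and chosen for illustration only.

```python
def dense_params(vocab_size, dim):
    """Parameters in a conventional dense embedding table."""
    return vocab_size * dim


def sparse_params(vocab_size, dim, base_size, nnz_per_word):
    """Parameters when only base words are dense (assumed scheme).

    Each remaining word stores nnz_per_word nonzero coefficients,
    counted as two numbers apiece: a base-word index and a weight.
    """
    base = min(vocab_size, base_size)
    rare = max(vocab_size - base_size, 0)
    return base * dim + rare * nnz_per_word * 2


def compose(code, base_embeddings):
    """Rebuild a word vector from its sparse code.

    code: dict mapping base-word index -> weight (hypothetical format).
    base_embeddings: list of equal-length lists of floats.
    """
    dim = len(base_embeddings[0])
    vector = [0.0] * dim
    for index, weight in code.items():
        for j in range(dim):
            vector[j] += weight * base_embeddings[index][j]
    return vector


if __name__ == "__main__":
    for vocab in (10_000, 100_000):
        print(vocab, dense_params(vocab, 200),
              sparse_params(vocab, 200, base_size=8_000, nnz_per_word=4))
```

Under these assumed settings, each word added beyond the base set costs 8 numbers instead of 200, which is the kind of slow growth the abstract describes. The actual figures depend on the paper's own configuration, which this sketch does not reproduce.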

Code Implementations: 1 repo
