Language Modeling with Power Low Rank Ensembles
This work addresses n-gram language modeling for natural language processing, proposing a flexible and efficient method that outperforms modified Kneser-Ney baselines. The contribution may appear incremental, since it builds on and generalizes standard smoothing approaches.
The authors tackle n-gram language modeling by introducing power low rank ensembles (PLRE), a framework that generalizes n-gram modeling to non-integer n and includes standard smoothing techniques as special cases. They report improved perplexity on large corpora and higher BLEU scores in machine translation compared to state-of-the-art baselines.
We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling where ensembles of low rank matrices and tensors are used to obtain smoothed probability estimates of words in context. Our method can be understood as a generalization of n-gram modeling to non-integer n, and includes standard techniques such as absolute discounting and Kneser-Ney smoothing as special cases. PLRE training is efficient and our approach outperforms state-of-the-art modified Kneser-Ney baselines in terms of perplexity on large corpora as well as on BLEU score in a downstream machine translation task.
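To build some intuition for how absolute discounting and Kneser-Ney smoothing could fall out of a single family, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm: the toy corpus, the function names `lower_order` and `discounted_bigram`, the exponent `rho`, and the discount `d = 0.75` are all hypothetical choices. The sketch raises each bigram count to a power `rho` before summing over contexts to form the lower-order distribution. With `rho = 1` this is the ordinary unigram estimate used by interpolated absolute discounting; with `rho = 0` every observed bigram type counts once, which gives the Kneser-Ney continuation estimate; values in between interpolate between the two.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical): "francisco" is frequent but only ever follows "san".
tokens = ("san francisco is far . san francisco is big . "
          "the cat is big . the dog is far .").split()
bigrams = Counter(zip(tokens, tokens[1:]))  # (context, word) -> count


def lower_order(bigrams, rho):
    """Distribution over w proportional to the sum over contexts v of c(v, w) ** rho.

    rho = 1 recovers unigram relative frequencies (of words in second position);
    rho = 0 recovers Kneser-Ney continuation probabilities (distinct contexts).
    """
    mass = defaultdict(float)
    for (_, w), c in bigrams.items():
        mass[w] += c ** rho
    total = sum(mass.values())
    return {w: m / total for w, m in mass.items()}


def discounted_bigram(bigrams, context, rho, d=0.75):
    """Interpolated absolute discounting for P(w | context).

    A fixed discount d is subtracted from every observed count and the
    freed mass is redistributed according to lower_order(bigrams, rho).
    """
    backoff = lower_order(bigrams, rho)
    row = {w: c for (v, w), c in bigrams.items() if v == context}
    n = sum(row.values())
    if n == 0:
        # Unseen context: fall back entirely on the lower-order distribution.
        return backoff
    lam = d * len(row) / n  # mass freed by discounting the observed types
    return {w: max(row.get(w, 0) - d, 0) / n + lam * p
            for w, p in backoff.items()}


if __name__ == "__main__":
    for rho in (1.0, 0.5, 0.0):
        p = lower_order(bigrams, rho)
        print(rho, round(p["francisco"], 3), round(p["cat"], 3))
```

In this toy corpus "francisco" occurs more often than "cat" but follows only one distinct word, so lowering `rho` toward 0 shrinks its lower-order probability relative to "cat" until the two coincide. The low rank matrix and tensor machinery described in the abstract is not reproduced here.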