CL · Aug 30, 2018

Direct Output Connection for a High-Rank Language Model

arXiv:1808.10143v2
Originality: Incremental advance
AI Analysis

The improvement is incremental: it advances language modeling and carries over to downstream NLP tasks such as machine translation and headline generation.

The paper improves language model performance by combining probability distributions computed from multiple RNN layers, and it achieves the best scores on the Penn Treebank and WikiText-2 benchmark datasets.

This paper proposes a state-of-the-art recurrent neural network (RNN) language model that combines probability distributions computed not only from the final RNN layer but also from middle layers. The proposed method raises the expressive power of a language model, following the matrix factorization interpretation of language modeling introduced by Yang et al. (2018). It improves on the current state-of-the-art language model and achieves the best scores on the Penn Treebank and WikiText-2, which are the standard benchmark datasets. Moreover, we show that the proposed method also benefits two application tasks: machine translation and headline generation. Our code is publicly available at: https://github.com/nttcslab-nlp/doc_lm.
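
The core idea in the abstract, combining word distributions computed from several RNN layers, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). It assumes a shared output matrix, one hidden vector per layer, and fixed mixture weights; the names `mix_layer_distributions`, `output_matrix`, and `mixture_weights` are hypothetical, and how the weights are actually obtained is not described in this summary.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mix_layer_distributions(hidden_states, output_matrix, mixture_weights):
    """Weighted sum of per-layer softmax distributions over the vocabulary.

    hidden_states:   one hidden vector per RNN layer (assumed same size)
    output_matrix:   vocabulary-size x hidden-size matrix (assumed shared)
    mixture_weights: non-negative weights summing to 1 (assumed fixed here)
    """
    per_layer = [softmax(matvec(output_matrix, h)) for h in hidden_states]
    vocab_size = len(output_matrix)
    return [
        sum(weight * probs[v] for weight, probs in zip(mixture_weights, per_layer))
        for v in range(vocab_size)
    ]


# Toy setup: 3-word vocabulary, hidden size 2, a middle layer and a final layer.
output_matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
middle_state = [0.5, -0.5]
final_state = [-1.0, 2.0]
mixture_weights = [0.3, 0.7]

mixed = mix_layer_distributions(
    [middle_state, final_state], output_matrix, mixture_weights
)
final_only = mix_layer_distributions(
    [middle_state, final_state], output_matrix, [0.0, 1.0]
)
```

Because the mixture is taken over probabilities rather than logits, the result is still a valid distribution, but its logarithm is no longer limited to the single-softmax form that the matrix factorization view of Yang et al. (2018) identifies as a rank bottleneck. Putting all the weight on the final layer recovers an ordinary single-softmax language model.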
