CL · Mar 12, 2018

Entity-Aware Language Model as an Unsupervised Reranker

arXiv:1803.04291v2
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of integrating external knowledge into language models for domains such as music. The advance is incremental because it builds on existing reranker methods.

The paper tackles the problem of incorporating entity relationships from a knowledge-base into language models without expensive manually annotated data. Its final model achieves a 0.44% absolute word error rate improvement over an LSTM language model on blind test data.

In language modeling, it is difficult to incorporate entity relationships from a knowledge-base. One solution is to use a reranker trained with global features derived from n-best lists. However, training such a reranker requires manually annotated n-best lists, which are expensive to obtain. We propose a method based on contrastive estimation that alleviates the need for such data. Experiments in the music domain demonstrate that global features, as well as features extracted from an external knowledge-base, can be incorporated into our reranker. Our final model, a simple ensemble of a language model and reranker, achieves a 0.44% absolute word error rate improvement over an LSTM language model on the blind test data.
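
To make the idea of ensembling a language model with a reranker concrete, here is a minimal Python sketch of rescoring an n-best list. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the two scores are combined, so the linear interpolation, the `weight` parameter, the function names, and the toy entity lookup are all hypothetical.

```python
from typing import Callable, List, Tuple


def rerank_nbest(
    nbest: List[Tuple[str, float]],
    reranker_score: Callable[[str], float],
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-sort hypotheses by an interpolated score (assumed combination rule).

    nbest holds (hypothesis, language-model log-score) pairs.
    weight is the share given to the reranker; 0.0 keeps the LM ordering.
    """
    combined = [
        (hyp, (1.0 - weight) * lm_score + weight * reranker_score(hyp))
        for hyp, lm_score in nbest
    ]
    # Highest combined score first.
    return sorted(combined, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in for a knowledge-base feature: reward hypotheses
# whose full text matches a known song/artist request.
KNOWN_REQUESTS = {"play yesterday by the beatles"}


def toy_entity_score(hyp: str) -> float:
    return 1.0 if hyp in KNOWN_REQUESTS else 0.0


# Toy n-best list: the LM slightly prefers the misrecognized artist name.
nbest = [
    ("play yesterday by the beetles", -4.0),
    ("play yesterday by the beatles", -4.2),
]

best = rerank_nbest(nbest, toy_entity_score, weight=0.5)[0][0]
print(best)
```

In this toy case the entity feature outweighs the small language-model preference, so the hypothesis consistent with the knowledge-base moves to the top. Setting `weight=0.0` recovers the language model's original ordering.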

