Syntax-Aware Language Modeling with Recurrent Neural Networks
This work addresses the lack of syntactic information in neural language models by enhancing character-level modeling with syntactic signals, though it is incremental in that it builds on existing parsing methods.
The paper tackles the problem that neural language models lack syntactic information by incorporating both lexical and syntactic features. The resulting models consistently outperform standard lexical LMs in character-level language modeling, while word-level models perform on a par with them.
Neural language models (LMs) are typically trained using only lexical features, such as surface forms of words. In this paper, we argue that this deprives the LM of crucial syntactic signals that can be detected with high confidence using existing parsers. We present a simple but highly effective approach for training neural LMs using both lexical and syntactic information, and a novel approach for applying such LMs to unparsed text using sequential Monte Carlo sampling. In experiments on a range of corpora and corpus sizes, we show that our approach consistently outperforms standard lexical LMs in character-level language modeling; for word-level models, on the other hand, it is on a par with standard language models. These results indicate potential for expanding LMs beyond lexical surface features to higher-level NLP features for character-level models.
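The abstract does not spell out how sequential Monte Carlo sampling is applied to unparsed text, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes the syntactic annotations are latent tags that must be marginalized out when scoring raw text, and it uses a hypothetical two-tag toy model (the tables `INIT`, `TRANS`, `EMIT` and all their numbers are invented for illustration) in place of a neural LM. A bootstrap particle filter samples tags, weights them by how well they explain each observed token, and accumulates an estimate of log p(tokens).

```python
import math
import random

# Hypothetical toy joint model over (tag, token) pairs, standing in for a
# neural LM with syntactic features. All probabilities are illustrative.
TAGS = ["N", "V"]
INIT = {"N": 0.6, "V": 0.4}
TRANS = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
EMIT = {"N": {"dog": 0.5, "runs": 0.1, "cat": 0.4},
        "V": {"dog": 0.1, "runs": 0.8, "cat": 0.1}}


def smc_log_likelihood(tokens, num_particles=2000, seed=0):
    """Bootstrap particle filter estimate of log p(tokens), tags marginalized."""
    rng = random.Random(seed)
    particles = [None] * num_particles  # each particle stores its latest tag
    log_lik = 0.0
    for token in tokens:
        proposed, weights = [], []
        for prev in particles:
            # Propose a tag from the model's prior given the previous tag.
            prior = INIT if prev is None else TRANS[prev]
            tag = rng.choices(TAGS, weights=[prior[t] for t in TAGS])[0]
            proposed.append(tag)
            # Weight the particle by how well its tag explains the token.
            weights.append(EMIT[tag][token])
        # The mean weight estimates p(token | earlier tokens).
        log_lik += math.log(sum(weights) / num_particles)
        # Resample particles in proportion to their weights.
        particles = rng.choices(proposed, weights=weights, k=num_particles)
    return log_lik


def exact_log_likelihood(tokens):
    """Forward algorithm, used only to sanity-check the sampler on the toy model."""
    alpha = {t: INIT[t] * EMIT[t][tokens[0]] for t in TAGS}
    for token in tokens[1:]:
        alpha = {t: EMIT[t][token] * sum(alpha[s] * TRANS[s][t] for s in TAGS)
                 for t in TAGS}
    return math.log(sum(alpha.values()))


if __name__ == "__main__":
    sentence = ["dog", "runs", "cat"]
    print("SMC estimate:", smc_log_likelihood(sentence))
    print("Exact value: ", exact_log_likelihood(sentence))
```

On this toy model the exact marginal is cheap to compute, which makes it a convenient check on the sampler; with a neural LM conditioning on the full history no such exact sum is available, which is the situation in which a sampling-based estimate becomes useful.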