CL | ML | Apr 7, 2019

Unsupervised Recurrent Neural Network Grammars

arXiv:1904.03746v61156 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of learning to jointly model syntax and surface structure without annotated parse trees. For natural language processing research it represents an incremental advance: it extends a supervised method, the recurrent neural network grammar, to the unsupervised setting.

The paper tackles unsupervised learning of recurrent neural network grammars (RNNGs) for language modeling and constituency grammar induction. On language modeling, the unsupervised models perform comparably to supervised RNNGs on English and Chinese benchmarks. On grammar induction, they are competitive with recent neural language models that induce tree structures.

Recurrent neural network grammars (RNNGs) are generative models of language which jointly model syntax and surface structure by incrementally generating a syntax tree and sentence in a top-down, left-to-right order. Supervised RNNGs achieve strong language modeling and parsing performance, but require an annotated corpus of parse trees. In this work, we experiment with unsupervised learning of RNNGs. Since directly marginalizing over the space of latent trees is intractable, we instead apply amortized variational inference. To maximize the evidence lower bound, we develop an inference network parameterized as a neural CRF constituency parser. On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese. On constituency grammar induction, they are competitive with recent neural language models that induce tree structures from words through attention mechanisms.
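
The abstract's central quantity, the evidence lower bound (ELBO), can be illustrated with a toy calculation. The sketch below is not the paper's implementation: it assumes a made-up latent space of three candidate trees that is small enough to enumerate, and the probabilities and the names `log_joint`, `q_approx`, and `elbo` are hypothetical. It only shows that E_q[log p(x, z) - log q(z | x)] never exceeds log p(x), and that the bound is tight when q equals the true posterior.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_marginal(log_joint):
    """Exact log p(x) = log sum_z p(x, z). Feasible only for tiny latent spaces."""
    return log_sum_exp(list(log_joint.values()))


def elbo(log_joint, q):
    """ELBO = E_q[log p(x, z) - log q(z | x)], computed by enumeration."""
    return sum(
        prob * (log_joint[z] - math.log(prob))
        for z, prob in q.items()
        if prob > 0.0
    )


def exact_posterior(log_joint):
    """p(z | x) = p(x, z) / p(x), the distribution that makes the bound tight."""
    log_px = log_marginal(log_joint)
    return {z: math.exp(lj - log_px) for z, lj in log_joint.items()}


# Hypothetical joint probabilities p(x, z) for one sentence x and three trees z.
log_joint = {
    "tree_a": math.log(0.06),
    "tree_b": math.log(0.03),
    "tree_c": math.log(0.01),
}

# Hypothetical output of an inference network: an approximate posterior q(z | x).
q_approx = {"tree_a": 0.5, "tree_b": 0.3, "tree_c": 0.2}

log_px = log_marginal(log_joint)
bound_approx = elbo(log_joint, q_approx)
bound_exact = elbo(log_joint, exact_posterior(log_joint))

print(f"log p(x)              = {log_px:.4f}")
print(f"ELBO, approximate q   = {bound_approx:.4f}")
print(f"ELBO, exact posterior = {bound_exact:.4f}")
```

With real sentences the set of trees is far too large to enumerate like this, which is why the abstract describes a learned inference network in place of exact marginalization.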

Code Implementations: 1 repo
