Rule Augmented Unsupervised Constituency Parsing
This work aims to improve the accuracy of unsupervised parsing for natural language processing applications, an incremental improvement achieved by integrating linguistic knowledge.
The paper tackles unsupervised constituency parsing by incorporating syntactic grammar rules to induce better syntactic structures, achieving new state-of-the-art results on the MNLI and WSJ benchmarks.
Recently, unsupervised parsing of syntactic trees has gained considerable attention. A prototypical approach to such unsupervised parsing employs reinforcement learning and auto-encoders. However, no mechanism ensures that the learnt model leverages the well-understood grammar of the language. We propose an approach that utilizes very generic linguistic knowledge of the language, expressed as syntactic rules, and thereby induces better syntactic structures. We introduce a novel formulation that takes advantage of the syntactic grammar rules and is independent of the base system. We achieve new state-of-the-art results on two benchmark datasets, MNLI and WSJ. The source code of the paper is available at https://github.com/anshuln/Diora_with_rules.
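
To make the idea of checking a parse against generic syntactic rules concrete, the following is a minimal illustrative sketch and not the formulation used in the paper. It assumes a hypothetical rule format (pairs of adjacent part-of-speech tags that are expected to form a constituent) and counts how many such expectations a predicted bracketing violates. The names `RULES` and `rule_violations`, and the choice of tags, are assumptions made only for this example; because the check reads nothing but tags and spans, it does not depend on which base parser produced the tree.

```python
from typing import List, Set, Tuple

# A constituent is a half-open token span [start, end).
Span = Tuple[int, int]

# Hypothetical rules (an assumption for illustration, not taken from the paper):
# each pair of adjacent POS tags listed here is expected to form a constituent.
RULES: Set[Tuple[str, str]] = {("DT", "NN"), ("JJ", "NN")}


def rule_violations(tags: List[str], spans: Set[Span],
                    rules: Set[Tuple[str, str]] = RULES) -> int:
    """Count adjacent tag pairs matched by a rule whose two-token span
    is missing from the predicted set of constituents."""
    violations = 0
    for i in range(len(tags) - 1):
        if (tags[i], tags[i + 1]) in rules and (i, i + 2) not in spans:
            violations += 1
    return violations


# "the dog chases the cat"
tags = ["DT", "NN", "VBZ", "DT", "NN"]

# Bracketing ((the dog) (chases (the cat)))
tree_a = {(0, 2), (3, 5), (2, 5), (0, 5)}
# Bracketing ((the (dog chases)) (the cat))
tree_b = {(1, 3), (0, 3), (3, 5), (0, 5)}

print(rule_violations(tags, tree_a))
print(rule_violations(tags, tree_b))
```

A count of this kind could, under these assumptions, serve as a penalty added to a base system's training objective or as a score for re-ranking candidate trees; how the paper actually integrates its rules is documented in the linked repository.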