Self-Training for Unsupervised Parsing with PRPN
This work addresses the challenge of parsing natural language without syntactic annotations and reports improved unsupervised parsing performance.
The paper proposes self-training for neural unsupervised parsing models, using model-predicted parses as supervision, and reports an 8.1% F1 improvement over the PRPN baseline and a 1.6% F1 gain over the previous state of the art.
Neural unsupervised parsing (UP) models learn to parse without access to syntactic annotations while being optimized for another task, such as language modeling. In this work, we propose self-training for neural UP models: we use aggregated annotations predicted by copies of our model as supervision for future copies. To use our model's predictions during training, we extend a recent neural UP architecture, the PRPN (Shen et al., 2018a), so that it can be trained in a semi-supervised fashion. We then add examples with parses predicted by our model to our unlabeled UP training data. Our self-trained model outperforms the PRPN by 8.1% F1 and the previous state of the art by 1.6% F1. In addition, we show that our architecture can also be helpful for semi-supervised parsing in ultra-low-resource settings.
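The self-training loop described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not say how predictions are aggregated, so the strict-majority vote over bracket spans is an assumption, and the names `aggregate_parses`, `self_training_round`, and `train_copy` are hypothetical. A parse is represented as a set of unlabeled (start, end) spans.

```python
from collections import Counter
from typing import Callable, FrozenSet, List, Sequence, Tuple

Span = Tuple[int, int]                      # (start, end) token indices, end exclusive
Parse = FrozenSet[Span]                     # unlabeled parse as a set of bracket spans
Example = Tuple[List[str], Parse]           # sentence paired with a (pseudo-)parse
Parser = Callable[[List[str]], Parse]       # a trained model copy


def aggregate_parses(parses: Sequence[Parse]) -> Parse:
    """Assumed aggregation: keep spans predicted by a strict majority of copies.

    Any two spans kept this way co-occur in at least one input parse, so they
    cannot cross as long as each input parse is itself a valid bracketing.
    """
    counts = Counter(span for parse in parses for span in parse)
    return frozenset(s for s, c in counts.items() if 2 * c > len(parses))


def self_training_round(
    train_copy: Callable[[List[Example]], Parser],  # hypothetical trainer
    sentences: List[List[str]],
    pseudo_labeled: List[Example],
    num_copies: int = 3,
) -> List[Example]:
    """Train several copies, then label each sentence with their aggregated parse."""
    copies = [train_copy(pseudo_labeled) for _ in range(num_copies)]
    return [(s, aggregate_parses([m(s) for m in copies])) for s in sentences]
```

In this sketch, the output of one round would be passed as `pseudo_labeled` to the next, so that later copies are trained with parses predicted by earlier ones; the first round can start from an empty list, which corresponds to purely unsupervised training.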