CL LG
Nov 2, 2020

Semi-supervised Autoencoding Projective Dependency Parsing

arXiv:2011.00704v131.0990 citations

Originality: Incremental advance

AI Analysis

This work uses unlabeled data to improve dependency parsing for NLP applications. The advance is incremental, since it builds on existing autoencoding and semi-supervised approaches.

The authors tackle semi-supervised dependency parsing with two autoencoding models, LAP and GAP, that encode the input into latent variables. Both models improve performance when labeled data is limited, and both outperform a previous semi-supervised model on the WSJ and UD datasets.

We describe two end-to-end autoencoding models for semi-supervised graph-based projective dependency parsing. The first model is a Locally Autoencoding Parser (LAP), which encodes the input using continuous latent variables in a sequential manner; the second model is a Globally Autoencoding Parser (GAP), which encodes the input into dependency trees as latent variables, with exact inference. Both models consist of two parts: an encoder enhanced by deep neural networks (DNN) that can use contextual information to encode the input into latent variables, and a decoder, a generative model that reconstructs the input. Both LAP and GAP admit a unified structure in which labeled and unlabeled data use different loss functions but share parameters. We conducted experiments on the WSJ and UD dependency parsing data sets, showing that our models can exploit unlabeled data to improve performance given a limited amount of labeled data, and that they outperform a previously proposed semi-supervised model.
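The idea of treating dependency trees as latent variables with exact inference, and of using different losses for labeled and unlabeled sentences over shared scores, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes arc-factored encoder scores `enc[h][d]` and arc-factored decoder log-probabilities `dec[h][d]` (both hypothetical names), allows multiple root attachments, and replaces the dynamic program a real parser would use with brute-force enumeration of projective trees, which is only feasible for very short sentences.

```python
import itertools
import math


def is_projective_tree(heads):
    """heads[i-1] is the head of word i (1-based); 0 is the artificial root."""
    n = len(heads)
    # Every word must reach the root without revisiting a node (no cycles).
    for i in range(1, n + 1):
        seen = set()
        node = i
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    # No two arcs may cross when drawn above the sentence.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:
                return False
    return True


def projective_trees(n):
    """Enumerate all projective trees over n words (toy stand-in for a DP)."""
    candidates = itertools.product(range(n + 1), repeat=n)
    return [heads for heads in candidates if is_projective_tree(heads)]


def tree_score(heads, arc_scores):
    """Arc-factored score: sum of arc_scores[head][dependent]."""
    return sum(arc_scores[h][d] for d, h in enumerate(heads, start=1))


def log_sum_exp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def labeled_loss(enc, gold_heads):
    """Negative log-likelihood of the gold tree under the encoder."""
    trees = projective_trees(len(gold_heads))
    log_z = log_sum_exp([tree_score(t, enc) for t in trees])
    return log_z - tree_score(gold_heads, enc)


def unlabeled_loss(enc, dec):
    """Negative log marginal reconstruction likelihood, summing over all trees.

    Assumption: the decoder's log-probability of the sentence given a tree
    also factors over arcs, as dec[head][dependent].
    """
    n = len(enc) - 1
    trees = projective_trees(n)
    log_z = log_sum_exp([tree_score(t, enc) for t in trees])
    log_joint = log_sum_exp(
        [tree_score(t, enc) + tree_score(t, dec) for t in trees]
    )
    return log_z - log_joint
```

Both losses read the same `enc` scores, so gradients from labeled and unlabeled sentences would update the same encoder parameters in a trained model. A practical system would compute the two log-sums with an inside-style dynamic program over projective trees instead of enumeration.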
