Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing
The work provides a more stable and accurate solution for unsupervised discontinuous constituency parsing, which is useful to computational linguistics researchers, though the contribution is incremental because it builds directly on a single existing model.
The paper tackles high variance in unsupervised discontinuous constituency parsing by building an ensemble from multiple runs of an existing parser and averaging their predicted trees; the ensemble outperforms all baselines across three datasets.
We address unsupervised discontinuous constituency parsing, where we observe a high variance in the performance of the only previous model in the literature. We propose to build an ensemble of different runs of the existing discontinuous parser by averaging the predicted trees, to stabilize and boost performance. To begin with, we provide a comprehensive computational complexity analysis (in terms of P and NP-completeness) for tree averaging under different setups of binarity and continuity. We then develop an efficient exact algorithm to tackle the task, which runs in a reasonable time for all samples in our experiments. Results on three datasets show that our method outperforms all baselines in all metrics; we also provide in-depth analyses of our approach.
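To make the notion of averaging predicted trees concrete, the following is a minimal sketch and not the exact algorithm developed in the paper. It rests on assumptions that the abstract does not state: each tree is represented as a set of constituents, each constituent is the set of word positions it covers (so a discontinuous constituent is simply a set with gaps), and the "average" is taken to be the tree with the highest mean F1 against the ensemble members. For simplicity the sketch searches only among the ensemble's own trees, whereas an exact method would search over all valid trees. The names `f1`, `hit_counts`, and `select_average_tree` are hypothetical.

```python
from collections import Counter
from typing import FrozenSet, List, Set

# Assumed representation: a constituent is the set of word positions it
# covers; a tree is the set of its constituents.
Constituent = FrozenSet[int]
Tree = Set[Constituent]


def f1(pred: Tree, ref: Tree) -> float:
    """Unlabeled constituent F1 between two trees."""
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def hit_counts(trees: List[Tree]) -> Counter:
    """Count how many ensemble members contain each constituent."""
    counts: Counter = Counter()
    for tree in trees:
        counts.update(tree)  # each constituent in a tree counts once
    return counts


def select_average_tree(trees: List[Tree]) -> Tree:
    """Return the ensemble member with the highest mean F1 against all
    members (a selection-based simplification, not an exact search)."""
    return max(trees, key=lambda cand: sum(f1(cand, t) for t in trees) / len(trees))
```

Restricting the candidates to the ensemble members keeps the sketch short and polynomial in the number of runs, at the cost of missing trees that combine constituents from different runs. The hit counts indicate which constituents the runs agree on and would be the natural input to a search over a larger candidate space.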