Edge-Linear First-Order Dependency Parsing with Undirected Minimum Spanning Tree Inference
This addresses efficiency issues for NLP practitioners by providing a faster parsing method, though it is incremental as it builds on existing MST-based approaches.
The paper tackles the super-linear runtime complexity of graph-based dependency parsing by proposing an inference algorithm that reduces it to O(m) (edges) using undirected minimum spanning trees, with experiments on 18 languages showing performance similar to the original O(n^2) parser.
The run time complexity of state-of-the-art inference algorithms in graph-based dependency parsing is super-linear in the number of input words (n). Recently, pruning algorithms for these models have shown to cut a large portion of the graph edges, with minimal damage to the resulting parse trees. Solving the inference problem in run time complexity determined solely by the number of edges (m) is hence of obvious importance. We propose such an inference algorithm for first-order models, which encodes the problem as a minimum spanning tree (MST) problem in an undirected graph. This allows us to utilize state-of-the-art undirected MST algorithms whose run time is O(m) at expectation and with a very high probability. A directed parse tree is then inferred from the undirected MST and is subsequently improved with respect to the directed parsing model through local greedy updates, both steps running in O(n) time. In experiments with 18 languages, a variant of the first-order MSTParser (McDonald et al., 2005b) that employs our algorithm performs very similarly to the original parser that runs an O(n^2) directed MST inference.