CL CC DSSep 8, 2014

Approximating solution structure of the Weighted Sentence Alignment problem

arXiv:1409.2433v1

Originality Incremental advance

AI Analysis

This provides theoretical hardness results for natural language processing researchers working on alignment problems, establishing computational limits for approximation algorithms.

The paper studies the computational complexity of approximating solution structures for weighted sentence alignment problems, showing that achieving significant agreement with optimal alignments is NP-hard for both phrases-to-words alignment (requiring more than half agreement) and general weighted sentence alignment (requiring more than two-thirds agreement).

We study the complexity of approximating solution structure of the bijective weighted sentence alignment problem of DeNero and Klein (2008). In particular, we consider the complexity of finding an alignment that has a significant overlap with an optimal alignment. We discuss ways of representing the solution for the general weighted sentence alignment as well as phrases-to-words alignment problem, and show that computing a string which agrees with the optimal sentence partition on more than half (plus an arbitrarily small polynomial fraction) positions for the phrases-to-words alignment is NP-hard. For the general weighted sentence alignment we obtain such bound from the agreement on a little over 2/3 of the bits. Additionally, we generalize the Hamming distance approximation of a solution structure to approximating it with respect to the edit distance metric, obtaining similar lower bounds.

View on arXiv PDF

Similar