Vicinity-Driven Paragraph and Sentence Alignment for Comparable Corpora
This work addresses the need for more versatile alignment methods in text simplification, though the contribution appears incremental, since it builds on existing alignment approaches.
The paper tackles two problems in aligning comparable corpora for text simplification: existing algorithms support only a limited range of alignment types, and they ignore clues present in comparable documents. It introduces flexible vicinity-driven paragraph and sentence alignment algorithms that support 1-N, N-1, N-N, and long-distance null alignments without supervised models.
Parallel corpora have driven great progress in the field of Text Simplification. However, most sentence alignment algorithms either support only a limited range of alignment types or simply ignore valuable clues present in comparable documents. We address this problem by introducing a new set of flexible vicinity-driven paragraph and sentence alignment algorithms that support 1-N, N-1, N-N, and long-distance null alignments without the need for hard-to-replicate supervised models.
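
To make the idea of a vicinity-driven search concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the word-overlap (Jaccard) similarity, the 0.3 threshold, and the names `jaccard` and `vicinity_align` are all assumptions chosen for illustration. After each aligned pair, the sketch looks only at the neighbouring cells of the similarity matrix, and it jumps ahead to the nearest similar pair when the neighbourhood is empty, which leaves the skipped sentences unaligned.

```python
def jaccard(a, b):
    """Word-overlap similarity in [0, 1] (an illustrative choice of metric)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def vicinity_align(src, tgt, threshold=0.3):
    """Greedy alignment that only inspects the vicinity of the last aligned pair.

    Returns a list of (source_index, target_index) pairs. Hypothetical sketch;
    the threshold value is an assumption, not taken from the paper.
    """
    n, m = len(src), len(tgt)
    sim = [[jaccard(s, t) for t in tgt] for s in src]

    def next_anchor(i0, j0):
        # Nearest sufficiently similar cell at or beyond (i0, j0), by Manhattan
        # distance. Sentences skipped on the way remain unaligned (null).
        best = None
        for i in range(i0, n):
            for j in range(j0, m):
                if sim[i][j] >= threshold:
                    d = (i - i0) + (j - j0)
                    if best is None or d < best[0]:
                        best = (d, i, j)
        return None if best is None else (best[1], best[2])

    pairs = []
    cur = next_anchor(0, 0)
    while cur is not None:
        pairs.append(cur)
        i, j = cur
        # Vicinity of (i, j): diagonal continues 1-1, right extends 1-N,
        # down extends N-1.
        cands = [(i + 1, j + 1), (i, j + 1), (i + 1, j)]
        cands = [(a, b) for a, b in cands
                 if a < n and b < m and sim[a][b] >= threshold]
        if cands:
            cur = max(cands, key=lambda c: sim[c[0]][c[1]])
        else:
            cur = next_anchor(i + 1, j + 1)
    return pairs


src = ["the cat sat on the mat",
       "it was raining heavily outside",
       "the dog barked at the postman loudly"]
tgt = ["the cat sat on the mat",
       "an unrelated remark",
       "the dog barked",
       "at the postman loudly"]
print(vicinity_align(src, tgt))  # [(0, 0), (2, 2), (2, 3)]
```

In this toy input, the second source sentence and the second target sentence receive no alignment, while the third source sentence is aligned to two target sentences (a 1-N alignment). A real system would likely use a stronger similarity measure and a less greedy search.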