Unbalanced Optimal Transport for Unbalanced Word Alignment
This work addresses the challenge of handling semantically divergent sentences in natural language processing, but its contribution is incremental: it adapts existing optimal transport techniques to a specific domain.
The study tackles monolingual word alignment, and in particular null alignment, where words have no counterparts, by applying balanced, partial, and unbalanced optimal transport without task-specific modifications. In both unsupervised and supervised experiments, these generic approaches were competitive with state-of-the-art specialized methods, especially on datasets with high null alignment frequencies.
Monolingual word alignment is crucial for modeling semantic interactions between sentences. In particular, null alignment, a phenomenon in which words have no corresponding counterparts, is pervasive and critical in handling semantically divergent sentences. Identifying null alignment is also useful on its own for reasoning about the semantic similarity of sentences, because it indicates that the sentences carry unequal information. To achieve unbalanced word alignment that values both alignment and null alignment, this study shows that the optimal transport (OT) family, i.e., balanced, partial, and unbalanced OT, offers natural and powerful approaches even without tailor-made techniques. Our extensive experiments covering unsupervised and supervised settings indicate that our generic OT-based alignment methods are competitive with state-of-the-art methods specially designed for word alignment, particularly on challenging datasets with high null alignment frequencies.
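To make the idea concrete, the following is a minimal sketch of how unbalanced OT can leave a word unaligned. It is not the authors' implementation. It assumes a cosine-distance cost over toy word embeddings, uniform word masses, a generalized Sinkhorn iteration for entropy-regularized unbalanced OT with KL marginal penalties, and a fixed threshold on the transport plan; the function names, the parameters `eps` and `tau`, and the threshold value are all hypothetical choices made for illustration.

```python
import numpy as np

def cosine_cost(X, Y):
    """Cost matrix: 1 - cosine similarity between rows of X and rows of Y."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return 1.0 - Xn @ Yn.T

def unbalanced_sinkhorn(a, b, C, eps=0.05, tau=0.1, n_iter=200):
    """Generalized Sinkhorn scaling for entropic unbalanced OT.

    eps: entropic regularization strength (assumed value).
    tau: weight of the KL penalties on both marginals (assumed value);
         a small tau lets the plan drop mass instead of paying a high cost.
    """
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    power = tau / (tau + eps)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]

def extract_alignment(P, threshold):
    """Keep plan entries above the threshold; words with none are null-aligned."""
    mask = P > threshold
    pairs = [(int(i), int(j)) for i, j in zip(*np.nonzero(mask))]
    null_src = [i for i in range(P.shape[0]) if not mask[i].any()]
    null_tgt = [j for j in range(P.shape[1]) if not mask[:, j].any()]
    return pairs, null_src, null_tgt

# Toy example: three source words, two target words.
# The third source word is orthogonal to every target word.
X = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
Y = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
a = np.full(3, 1.0 / 3.0)  # uniform mass over source words
b = np.full(2, 1.0 / 2.0)  # uniform mass over target words

P = unbalanced_sinkhorn(a, b, cosine_cost(X, Y))
pairs, null_src, null_tgt = extract_alignment(P, threshold=0.1)
print(pairs, null_src, null_tgt)
```

In this toy setting, balanced OT would be forced to send the third source word's mass somewhere, whereas the relaxed marginal constraints allow that mass to be discarded, which is what lets the word be read off as null-aligned. In practice the cost function, mass distribution, and threshold would need to be chosen or tuned for the data at hand.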