CL, May 21, 2023

A Framework for Bidirectional Decoding: Case Study in Morphological Inflection

arXiv:2305.12580v2131 citations
Originality: Highly original
AI Analysis

This work addresses the choice of decoding order in sequence-to-sequence NLP, using morphological inflection as a case study. Its outside-in bidirectional decoder does particularly well on long sequences and on datasets with fewer unique lemmas but more examples per lemma.

The paper proposes a bidirectional decoding framework for sequence-to-sequence tasks that generates sequences from the outside in. It sets the state of the art on the 2022 and 2023 shared tasks, beating the next-best systems by over 4.7 and 2.7 points in average accuracy, respectively.

Transformer-based encoder-decoder models that generate outputs in a left-to-right fashion have become standard for sequence-to-sequence tasks. In this paper, we propose a framework for decoding that produces sequences from the "outside-in": at each step, the model chooses to generate a token on the left, on the right, or join the left and right sequences. We argue that this is more principled than prior bidirectional decoders. Our proposal supports a variety of model architectures and includes several training methods, such as a dynamic programming algorithm that marginalizes out the latent ordering variable. Our model sets state-of-the-art (SOTA) on the 2022 and 2023 shared tasks, beating the next best systems by over 4.7 and 2.7 points in average accuracy respectively. The model performs particularly well on long sequences, can implicitly learn the split point of words composed of stem and affix, and performs better relative to the baseline on datasets that have fewer unique lemmas (but more examples per lemma).
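The outside-in procedure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the policy signature, the action names `LEFT`/`RIGHT`/`JOIN`, and the helpers `scripted_policy` and `marginal_prob` are all hypothetical and chosen only for illustration. The first function builds a sequence by growing a left prefix and a right suffix until the policy chooses to join them. The second shows one way a dynamic program could sum over all left/right orderings of a fixed target of length `n`, assuming each step's probability depends only on how many tokens have been placed on each side.

```python
from typing import Callable, Dict, List, Tuple

LEFT, RIGHT, JOIN = "L", "R", "J"  # hypothetical action labels

# Hypothetical policy: maps (left prefix, right suffix) to (action, token).
Policy = Callable[[List[str], List[str]], Tuple[str, str]]


def decode_outside_in(policy: Policy, max_steps: int = 50) -> List[str]:
    """Grow a prefix and a suffix until the policy joins them."""
    left: List[str] = []
    right: List[str] = []
    for _ in range(max_steps):
        action, token = policy(left, right)
        if action == LEFT:
            left.append(token)        # extend the prefix rightwards
        elif action == RIGHT:
            right.insert(0, token)    # extend the suffix leftwards
        elif action == JOIN:
            break                     # prefix and suffix meet
        else:
            raise ValueError(f"unknown action: {action}")
    return left + right


def scripted_policy(target: str) -> Policy:
    """Toy stand-in for a trained model: alternates sides on a known target."""
    def policy(left: List[str], right: List[str]) -> Tuple[str, str]:
        i, j = len(left), len(right)
        if i + j == len(target):
            return JOIN, ""
        if i <= j:
            return LEFT, target[i]
        return RIGHT, target[len(target) - 1 - j]
    return policy


def marginal_prob(n: int, p_step: Callable[[int, int, str], float]) -> float:
    """Sum the probability of a length-n target over all L/R orderings.

    p_step(i, j, action) is an assumed model probability of taking `action`
    (and emitting the correct token) with i left and j right tokens placed.
    """
    alpha: Dict[Tuple[int, int], float] = {(0, 0): 1.0}
    total = 0.0
    for k in range(n + 1):            # k = tokens placed so far
        for i in range(k + 1):
            j = k - i
            a = alpha.get((i, j), 0.0)
            if k == n:                # sequence complete: must join
                total += a * p_step(i, j, JOIN)
            else:
                nxt_l, nxt_r = (i + 1, j), (i, j + 1)
                alpha[nxt_l] = alpha.get(nxt_l, 0.0) + a * p_step(i, j, LEFT)
                alpha[nxt_r] = alpha.get(nxt_r, 0.0) + a * p_step(i, j, RIGHT)
    return total


def uniform_step(i: int, j: int, action: str) -> float:
    """Toy probabilities: join with certainty, otherwise a fair coin."""
    return 1.0 if action == JOIN else 0.5
```

For example, `"".join(decode_outside_in(scripted_policy("walked")))` rebuilds the word by alternating sides, and `marginal_prob(3, uniform_step)` sums over all eight orderings of a three-token target. The dynamic program visits each `(i, j)` state once, so it needs on the order of `n**2` states rather than enumerating `2**n` orderings.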

Code Implementations: 1 repo
