CL, LG · Jan 20, 2021

Data-to-text Generation by Splicing Together Nearest Neighbors

arXiv:2101.08248v4664 citations
Originality: Incremental advance
AI Analysis

This approach targets data-to-text generation in applications that require interpretability and control. It is an incremental advance, since it builds on existing retrieval-based methods.

The paper tackles data-to-text generation by splicing together text segments retrieved from neighboring source-target pairs, learning a policy that inserts or replaces segments rather than generating token by token. The method performs on par with strong baselines in automatic and human evaluations while offering more interpretable and controllable generation.

We propose to tackle data-to-text generation tasks by directly splicing together retrieved segments of text from "neighbor" source-target pairs. Unlike recent work that conditions on retrieved neighbors but generates text token-by-token, left-to-right, we learn a policy that directly manipulates segments of neighbor text, by inserting or replacing them in partially constructed generations. Standard techniques for training such a policy require an oracle derivation for each generation, and we prove that finding the shortest such derivation can be reduced to parsing under a particular weighted context-free grammar. We find that policies learned in this way perform on par with strong baselines in terms of automatic and human evaluation, but allow for more interpretable and controllable generation.
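The splicing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Splice` action format, the function names, and the example sentences are all hypothetical, and the action sequence is written by hand rather than chosen by a learned policy. It only shows how a generation could be built by inserting or replacing spans copied from neighbor texts.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Splice:
    """Hypothetical action: copy neighbors[neighbor][src_start:src_end]
    over canvas[dst_start:dst_end]. If dst_start == dst_end, the action
    is a pure insertion; otherwise it replaces the covered span."""
    neighbor: int
    src_start: int
    src_end: int
    dst_start: int
    dst_end: int


def apply_splice(canvas: List[str], neighbors: List[List[str]], action: Splice) -> List[str]:
    # Copy a contiguous segment of a neighbor's text into the canvas.
    segment = neighbors[action.neighbor][action.src_start:action.src_end]
    return canvas[:action.dst_start] + segment + canvas[action.dst_end:]


def run_derivation(neighbors: List[List[str]], actions: List[Splice]) -> List[str]:
    # Start from an empty canvas and apply each action in order.
    canvas: List[str] = []
    for action in actions:
        canvas = apply_splice(canvas, neighbors, action)
    return canvas


# Made-up neighbor targets (illustrative only, not taken from the paper).
NEIGHBORS = [
    "the eagle is a cheap coffee shop".split(),
    "aromi serves italian food near the river".split(),
]

# Hand-written derivation; in the paper a learned policy selects the actions.
ACTIONS = [
    Splice(neighbor=0, src_start=0, src_end=7, dst_start=0, dst_end=0),  # insert all of neighbor 0
    Splice(neighbor=1, src_start=0, src_end=1, dst_start=0, dst_end=2),  # replace "the eagle" with "aromi"
    Splice(neighbor=1, src_start=4, src_end=7, dst_start=6, dst_end=6),  # append "near the river"
]

if __name__ == "__main__":
    print(" ".join(run_derivation(NEIGHBORS, ACTIONS)))
```

Because every token in the output is traceable to a specific neighbor span, a derivation like this is easy to inspect, which is the sense in which the abstract calls the approach interpretable. Finding the shortest such derivation for training, which the paper reduces to parsing under a weighted context-free grammar, is not attempted here.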

Code Implementations: 1 repo