CL · LG · Dec 5, 2023

Compositional Generalization for Data-to-Text Generation

arXiv:2312.02748v1131 citations · h-index: 86 · EMNLP
Originality: Incremental advance
AI Analysis

The paper addresses a key limitation in data-to-text generation that matters for applications requiring reliable descriptions of structured data. The advance is incremental, since it builds on existing methods.

The paper tackled the problem of compositional generalization in data-to-text generation, where systems fail with unseen predicate combinations, and proposed a model that clusters predicates to generate text sentence-by-sentence, achieving a 31% improvement in faithfulness over T5 baselines.

Data-to-text generation involves transforming structured data, often represented as predicate-argument tuples, into coherent textual descriptions. Despite recent advances, systems still struggle when confronted with unseen combinations of predicates, producing unfaithful descriptions (e.g. hallucinations or omissions). We refer to this issue as compositional generalization, and it encouraged us to create a benchmark for assessing the performance of different approaches on this specific problem. Furthermore, we propose a novel model that addresses compositional generalization by clustering predicates into groups. Our model generates text in a sentence-by-sentence manner, relying on one cluster of predicates at a time. This approach significantly outperforms T5 baselines across all evaluation metrics. Notably, it achieved a 31% improvement over T5 in terms of a metric focused on maintaining faithfulness to the input.
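
The cluster-then-generate idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the predicate-to-cluster map, the `(subject, predicate, object)` triple format, and the template-based `realize_sentence` function are all hypothetical stand-ins, chosen only to show how input tuples might be grouped by predicate cluster and then verbalized one cluster per sentence.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical input format: (subject, predicate, object) tuples.
Triple = Tuple[str, str, str]

# Hypothetical hand-written predicate-to-cluster map. It stands in for
# whatever clustering a real model would learn or compute.
PREDICATE_CLUSTER: Dict[str, str] = {
    "birthPlace": "origin",
    "birthDate": "origin",
    "occupation": "career",
    "employer": "career",
}


def cluster_triples(triples: List[Triple]) -> List[List[Triple]]:
    """Group triples by predicate cluster, in first-seen cluster order."""
    clusters: Dict[str, List[Triple]] = defaultdict(list)
    for triple in triples:
        predicate = triple[1]
        # A predicate missing from the map forms its own cluster.
        key = PREDICATE_CLUSTER.get(predicate, predicate)
        clusters[key].append(triple)
    return list(clusters.values())


def realize_sentence(cluster: List[Triple]) -> str:
    """Toy template generator: verbalize one cluster as one sentence.

    A real system would use a neural generator here (assumption).
    """
    subject = cluster[0][0]
    facts = ", ".join(f"{pred} {obj}" for _, pred, obj in cluster)
    return f"{subject}: {facts}."


def generate(triples: List[Triple]) -> str:
    """Produce text sentence by sentence, one predicate cluster at a time."""
    return " ".join(realize_sentence(c) for c in cluster_triples(triples))


example = [
    ("Ada", "birthPlace", "London"),
    ("Ada", "occupation", "mathematician"),
    ("Ada", "birthDate", "1815"),
]
print(generate(example))
```

Because each sentence is conditioned on a single cluster, an unseen combination of predicates in the input only needs clusters that were individually seen before, which is the intuition behind generating one cluster at a time.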

