AGGGEN: Ordering and Aggregating while Generating
This work aims to improve control and interpretability in data-to-text generation for applications such as automated reporting. The contribution is incremental: it builds on existing neural methods and adds explicit planning stages.
The paper addresses text generation from structured data by reintroducing two explicit sentence planning stages, input ordering and input aggregation, into neural data-to-text models. Experiments on the WebNLG and E2E datasets indicate that the resulting system is more interpretable, expressive, and robust while retaining fluency.
We present AGGGEN (pronounced 'again'), a data-to-text model which re-introduces two explicit sentence planning stages into neural data-to-text systems: input ordering and input aggregation. In contrast to previous work using sentence planning, our model is still end-to-end: AGGGEN performs sentence planning at the same time as generating text by learning latent alignments (via semantic facts) between input representation and target text. Experiments on the WebNLG and E2E challenge data show that by using fact-based alignments our approach is more interpretable, expressive, robust to noise, and easier to control, while retaining the advantages of end-to-end systems in terms of fluency. Our code is available at https://github.com/XinnuoXu/AggGen.
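To make the two planning stages concrete, the following is a minimal sketch of what a sentence plan does to a set of input facts: ordering permutes the facts, and aggregation decides which consecutive facts share a sentence. It is not the AGGGEN model, which learns these decisions as latent alignments during generation; here the plan is supplied by hand. The function `apply_plan`, its parameters `order` and `new_sentence`, and the example facts are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A fact is represented as a (subject, predicate, object) triple.
# This representation is an assumption made for the sketch.
Fact = Tuple[str, str, str]


def apply_plan(
    facts: List[Fact],
    order: List[int],
    new_sentence: List[bool],
) -> List[List[Fact]]:
    """Reorder the facts, then group consecutive facts into sentences.

    order        -- permutation of fact indices (the ordering decision)
    new_sentence -- one flag per position in the ordered sequence; True
                    means the fact at that position opens a new sentence
                    (the aggregation decision)
    """
    if sorted(order) != list(range(len(facts))):
        raise ValueError("order must be a permutation of the fact indices")
    if len(new_sentence) != len(facts):
        raise ValueError("need one aggregation flag per fact")

    sentences: List[List[Fact]] = []
    for position, idx in enumerate(order):
        # The first fact always opens a sentence, whatever its flag says.
        if position == 0 or new_sentence[position]:
            sentences.append([])
        sentences[-1].append(facts[idx])
    return sentences


# Invented restaurant-style facts, for illustration only.
facts = [
    ("The Mill", "area", "riverside"),
    ("The Mill", "eatType", "pub"),
    ("The Mill", "food", "Italian"),
]

# Plan: mention eatType and food together in one sentence, then area alone.
plan = apply_plan(facts, order=[1, 2, 0], new_sentence=[True, False, True])
for sentence_facts in plan:
    print(sentence_facts)
```

Because the plan is an explicit object, changing `order` or `new_sentence` changes which facts appear in which sentence without touching the surface realiser, which is the kind of control the abstract refers to.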