Data-to-text Generation with Entity Modeling
This work addresses data-to-text generation in domains such as sports reporting and offers an incremental improvement by modeling entities explicitly rather than as ordinary vocabulary tokens.
The paper tackles data-to-text generation by proposing an entity-centric neural architecture that dynamically updates entity-specific representations. It outperforms competitive baselines on the RotoWire benchmark and on a new baseball dataset, roughly five times larger, in both automatic and human evaluation.
Recent approaches to data-to-text generation have shown great promise thanks to the use of large-scale datasets and the application of neural network architectures which are trained end-to-end. These models rely on representation learning to select content appropriately, structure it coherently, and verbalize it grammatically, treating entities as nothing more than vocabulary tokens. In this work we propose an entity-centric neural architecture for data-to-text generation. Our model creates entity-specific representations which are dynamically updated. Text is generated conditioned on the data input and entity memory representations using hierarchical attention at each time step. We present experiments on the RotoWire benchmark and a (five times larger) new dataset on the baseball domain which we create. Our results show that the proposed model outperforms competitive baselines in automatic and human evaluation.
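The two mechanisms named in the abstract, dynamically updated entity representations and hierarchical attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the update rule or the attention scoring function, so the gated interpolation, the dot-product scores, and every name here (`update_entity_memory`, `hierarchical_attention`, `W_gate`, `W_cand`) are assumptions chosen for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def update_entity_memory(memory, decoder_state, W_gate, W_cand):
    """Assumed gated update of per-entity memory at one decoding step.

    memory:        (E, d) one vector per entity
    decoder_state: (d,)   current decoder hidden state
    W_gate, W_cand: (2d, d) hypothetical parameter matrices
    """
    num_entities = memory.shape[0]
    # Pair the shared decoder state with each entity's own memory vector.
    inp = np.concatenate(
        [np.tile(decoder_state, (num_entities, 1)), memory], axis=1
    )  # (E, 2d)
    gate = 1.0 / (1.0 + np.exp(-(inp @ W_gate)))  # (E, d), values in (0, 1)
    candidate = np.tanh(inp @ W_cand)             # (E, d)
    # Interpolate between the old memory and the candidate, per dimension.
    return (1.0 - gate) * memory + gate * candidate


def hierarchical_attention(decoder_state, memory, records):
    """Two-level attention: over entities, then over each entity's records.

    records: (E, R, d) encoded data records, R per entity
    Returns the context vector (d,) plus both sets of attention weights.
    """
    entity_weights = softmax(memory @ decoder_state)            # (E,)
    record_weights = softmax(records @ decoder_state, axis=1)   # (E, R)
    # Summarize each entity's records, then mix the summaries by entity weight.
    per_entity = np.einsum("er,erd->ed", record_weights, records)  # (E, d)
    context = entity_weights @ per_entity                          # (d,)
    return context, entity_weights, record_weights


# Toy usage with random inputs: 3 entities, 5 records each, hidden size 4.
rng = np.random.default_rng(0)
memory = rng.normal(size=(3, 4))
records = rng.normal(size=(3, 5, 4))
state = rng.normal(size=4)
W_gate = rng.normal(size=(8, 4))
W_cand = rng.normal(size=(8, 4))

memory = update_entity_memory(memory, state, W_gate, W_cand)
context, entity_weights, record_weights = hierarchical_attention(
    state, memory, records
)
```

In this sketch the memory is refreshed before attention is computed, so the context vector used to generate the next token reflects what has already been said about each entity. A trained model would learn the parameters and would typically use a learned scoring function rather than raw dot products.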