CL, LG · Aug 27, 2021

ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models

arXiv:2108.12472v1666 citations
Originality: Highly original
AI Analysis

This paper addresses automated knowledge base construction and text generation, with applications in natural language processing and data management. It represents a strong incremental advance with specific performance gains.

The paper tackles the bidirectional generation between text and knowledge bases by introducing ReGen, which uses reinforcement learning to achieve state-of-the-art results on the WebNLG+ 2020 dataset, significantly improving published challenge scores for both text-to-graph and graph-to-text tasks.

Automatic construction of relevant Knowledge Bases (KBs) from text and generation of semantically meaningful text from KBs are both long-standing goals in Machine Learning. In this paper, we present ReGen, a bidirectional text and graph generation approach that leverages Reinforcement Learning (RL) to improve performance. Graph linearization enables us to reframe both tasks as a sequence-to-sequence generation problem regardless of the generative direction. This in turn allows the use of Reinforcement Learning for sequence training, where the model itself is employed as its own critic, leading to Self-Critical Sequence Training (SCST). We present an extensive investigation demonstrating that the use of RL via SCST benefits graph and text generation on the WebNLG+ 2020 and TekGen datasets. Our system provides state-of-the-art results on WebNLG+ 2020 by significantly improving upon published results from the WebNLG+ 2020 Challenge for both text-to-graph and graph-to-text generation tasks.
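The two ideas in the abstract, graph linearization and a self-critical reward baseline, can be illustrated with a minimal sketch. This is not the authors' implementation. The marker tokens `<S>`, `<P>`, `<O>`, the function names, and the example triple are assumptions made for illustration, and the loss follows the standard SCST formulation, in which the reward of the model's own greedy decode serves as the baseline.

```python
# Hypothetical marker tokens for subject, predicate, and object.
# The markers actually used in the paper may differ.
SUBJ, PRED, OBJ = "<S>", "<P>", "<O>"


def linearize(triples):
    """Flatten (subject, predicate, object) triples into one string,
    so a graph can be read or written by a sequence-to-sequence model."""
    return " ".join(f"{SUBJ} {s} {PRED} {p} {OBJ} {o}" for s, p, o in triples)


def delinearize(text):
    """Recover the list of triples from a linearized string."""
    triples = []
    for chunk in text.split(SUBJ)[1:]:
        s, rest = chunk.split(PRED, 1)
        p, o = rest.split(OBJ, 1)
        triples.append((s.strip(), p.strip(), o.strip()))
    return triples


def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    """Self-critical loss for a single sampled sequence.

    sample_logprobs: per-token log-probabilities of the sampled sequence.
    sample_reward:   reward of the sampled sequence (any sequence-level metric).
    greedy_reward:   reward of the model's own greedy decode, used as baseline.
    """
    advantage = sample_reward - greedy_reward
    # Minimizing this raises the likelihood of samples that beat the
    # greedy decode and lowers it for samples that score worse.
    return -advantage * sum(sample_logprobs)


graph = [("Alan_Bean", "occupation", "Test_pilot")]
flat = linearize(graph)
restored = delinearize(flat)
loss = scst_loss([-0.5, -1.5], sample_reward=0.8, greedy_reward=0.6)
```

Because both directions operate on plain token sequences, the same loss applies whether the sampled sequence is text or a linearized graph; only the reward metric changes.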

Code Implementations: 1 repo
