CL
May 27, 2022

Controllable Text Generation with Neurally-Decomposed Oracle

arXiv:2205.14219v2
45 citations
h-index: 64
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of precise control in text generation, for applications such as constrained writing and style adaptation, and represents an incremental improvement over existing methods.

The paper tackles the problem of controlling auto-regressive text generation models by proposing a framework that decomposes sequence-level boolean oracles into token-level guidance, requiring no extra labeled data. Experiments on lexical constraints and machine translation formality control show it efficiently guides base models while maintaining high quality.

We propose a general and efficient framework to control auto-regressive generation models with NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and a sequence-level boolean oracle function, we propose to decompose the oracle function into token-level guidance to steer the base model in text generation. Specifically, the token-level guidance is approximated by a neural model trained with examples sampled from the base model, demanding no additional auxiliary labeled data. Based on posterior regularization, we present the closed-form optimal solution to incorporate the token-level guidance into the base model for controllable generation. We further provide a theoretical analysis of how the approximation quality of NADO affects the controllable generation results. Experiments conducted on two applications: (1) text generation with lexical constraints and (2) machine translation with formality control demonstrate that our framework efficiently guides the base model towards the given oracle while maintaining high generation quality.
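The decomposition described in the abstract can be illustrated with a toy sketch. It assumes the closed-form solution reweights each next-token probability of the base model by the ratio of expected oracle values, q(x_i | x_<i) = p(x_i | x_<i) · R(x_≤i) / R(x_<i), where R(prefix) is the probability under the base model that a completion of the prefix satisfies the oracle. In the paper R is approximated by a neural model trained on samples from the base model; here it is computed exactly by enumeration, which is only feasible for a tiny vocabulary and a short fixed length. The vocabulary, the base model, the oracle, and all function names below are hypothetical and chosen for illustration only.

```python
from itertools import product

VOCAB = ["a", "b", "c"]  # hypothetical toy vocabulary
LENGTH = 3               # assumption: fixed sequence length, no end-of-sequence token


def base_prob(prefix, token):
    """Toy base model p(token | prefix): favours repeating the previous token."""
    if not prefix:
        return 1.0 / len(VOCAB)
    return 0.5 if token == prefix[-1] else 0.25


def oracle(seq):
    """Sequence-level boolean oracle C(x): a lexical constraint, 'contains b'."""
    return "b" in seq


def expected_oracle(prefix):
    """R(prefix): probability under the base model that a completion satisfies C.

    Computed exactly by recursion here; a neural approximation would replace
    this function in a realistic setting.
    """
    if len(prefix) == LENGTH:
        return 1.0 if oracle(prefix) else 0.0
    return sum(
        base_prob(prefix, tok) * expected_oracle(prefix + (tok,)) for tok in VOCAB
    )


def guided_prob(prefix, token):
    """Token-level guided distribution q(token | prefix) = p * R(prefix+token) / R(prefix)."""
    r_prefix = expected_oracle(prefix)
    if r_prefix == 0.0:
        raise ValueError("no completion of this prefix can satisfy the oracle")
    return base_prob(prefix, token) * expected_oracle(prefix + (token,)) / r_prefix


def seq_prob(seq, step_prob):
    """Probability of a full sequence under an auto-regressive step distribution."""
    prob = 1.0
    for i, tok in enumerate(seq):
        prob *= step_prob(seq[:i], tok)
    return prob


if __name__ == "__main__":
    for seq in product(VOCAB, repeat=LENGTH):
        print("".join(seq), round(seq_prob(seq, base_prob), 4),
              round(seq_prob(seq, guided_prob), 4))
```

With an exact R, the guided model assigns zero probability to sequences that violate the oracle and renormalises the base model over the sequences that satisfy it, so the relative preferences of the base model among valid sequences are preserved. With an approximate R this holds only approximately, which is the situation the paper's theoretical analysis addresses.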

Code Implementations: 1 repo
