CL, AI · Apr 15, 2023

Tractable Control for Autoregressive Language Generation

arXiv:2304.07438v4 · 66 citations · h-index: 50
Originality: Incremental advance
AI Analysis

This work addresses controlled text generation with large language models. It offers a new way to impose constraints, though the advance is incremental because it combines existing techniques.

The paper tackles the challenge of generating text that satisfies complex lexical constraints with autoregressive language models. It proposes GeLaTo, a framework that uses tractable probabilistic models, such as distilled hidden Markov models, to guide generation. It reports state-of-the-art performance on benchmarks such as CommonGen, with large margins over baselines.

Despite the success of autoregressive large language models in text generation, it remains a major challenge to generate text that satisfies complex constraints: sampling from the conditional distribution $\Pr(\text{text} \mid \alpha)$ is intractable for even the simplest lexical constraints $\alpha$. To overcome this challenge, we propose to use tractable probabilistic models (TPMs) to impose lexical constraints in autoregressive text generation models, which we refer to as GeLaTo (Generating Language with Tractable Constraints). To demonstrate the effectiveness of this framework, we use distilled hidden Markov models, where we can efficiently compute $\Pr(\text{text} \mid \alpha)$, to guide autoregressive generation from GPT2. GeLaTo achieves state-of-the-art performance on challenging benchmarks for constrained text generation (e.g., CommonGen), beating various strong baselines by a large margin. Our work not only opens up new avenues for controlling large language models but also motivates the development of more expressive TPMs.
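
To make the guidance idea concrete, here is a minimal Python sketch on a toy hidden Markov model. It is not the authors' implementation. The HMM parameters, the fixed length `N`, the single-keyword constraint, and the uniform `lm_next` stand-in for GPT2 are all assumptions chosen for illustration. The sketch also assumes that the guided next-token distribution is proportional to the base model's next-token probability times the HMM's probability that the constraint can still be satisfied, which is one plausible reading of the abstract.

```python
# Toy HMM; every number below is made up for illustration.
INIT = [0.6, 0.4]                           # P(z_1 = h)
TRANS = [[0.7, 0.3], [0.2, 0.8]]            # P(z_{t+1} = g | z_t = h)
EMIT = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]   # P(x_t = v | z_t = h)
N = 4        # fixed sequence length (assumption)
KEYWORD = 2  # constraint alpha: token 2 appears somewhere in x_1..x_N


def forward(prefix):
    """Forward weights f[h] = P(x_{1:t} = prefix, z_t = h) for a non-empty prefix."""
    f = [INIT[h] * EMIT[h][prefix[0]] for h in range(len(INIT))]
    for tok in prefix[1:]:
        f = [sum(f[g] * TRANS[g][h] for g in range(len(f))) * EMIT[h][tok]
             for h in range(len(f))]
    return f


def miss_prob(steps):
    """m[h] = P(keyword absent from the next `steps` tokens | current state h)."""
    m = [1.0] * len(INIT)
    for _ in range(steps):
        m = [sum(TRANS[h][g] * (1.0 - EMIT[g][KEYWORD]) * m[g]
                 for g in range(len(m)))
             for h in range(len(m))]
    return m


def constraint_prob(prefix):
    """Pr(alpha | prefix) under the HMM, computed exactly by dynamic programming."""
    if KEYWORD in prefix:
        return 1.0
    remaining = N - len(prefix)
    if not prefix:
        # No observation yet: start from the initial state distribution.
        m = miss_prob(remaining - 1)
        miss = sum(INIT[h] * (1.0 - EMIT[h][KEYWORD]) * m[h]
                   for h in range(len(INIT)))
        return 1.0 - miss
    f = forward(prefix)
    m = miss_prob(remaining)
    return 1.0 - sum(fh * mh for fh, mh in zip(f, m)) / sum(f)


def lm_next(prefix):
    """Hypothetical stand-in for the base LM's next-token distribution (uniform)."""
    return [1.0 / 3] * 3


def guided_next(prefix):
    """Reweight the LM's next-token distribution by Pr(alpha | prefix + [v])."""
    scores = [p * constraint_prob(prefix + [v])
              for v, p in enumerate(lm_next(prefix))]
    total = sum(scores)
    return [s / total for s in scores]


# One slot left and the keyword has not appeared yet, so all mass moves to it.
print(guided_next([0, 1, 0]))  # prints [0.0, 0.0, 1.0]
```

The point of the toy is that `constraint_prob` is exact and cheap: its cost grows linearly with the sequence length and quadratically with the number of hidden states, whereas computing the same quantity directly from an autoregressive model would require summing over all continuations.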

Code Implementations: 1 repo
