CL, AI | Jun 5, 2023

Structured Voronoi Sampling

arXiv:2306.03061v3 | 5 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work provides a principled approach for controlled text generation, addressing a bottleneck in gradient-based sampling for language models, though it is incremental as it builds on existing Hamiltonian Monte Carlo techniques.

The paper addresses the lack of theoretically grounded gradient-based sampling methods for language models. It proposes Structured Voronoi Sampling (SVS), which uses the discrete distributions given by language models to define densities and samples from them with an algorithm based on Hamiltonian Monte Carlo. In the reported experiments, SVS samples are closer to a known reference distribution and follow control targets better than alternative methods.

Gradient-based sampling algorithms have demonstrated their effectiveness in text generation, especially in the context of controlled text generation. However, there exists a lack of theoretically grounded and principled approaches for this task. In this paper, we take an important step toward building a principled approach for sampling from language models with gradient-based methods. We use discrete distributions given by language models to define densities and develop an algorithm based on Hamiltonian Monte Carlo to sample from them. We name our gradient-based technique Structured Voronoi Sampling (SVS). In an experimental setup where the reference distribution is known, we show that the empirical distribution of SVS samples is closer to the reference distribution compared to alternative sampling schemes. Furthermore, in a controlled generation task, SVS is able to generate fluent and diverse samples while following the control targets significantly better than other methods.
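As a rough illustration of the Hamiltonian Monte Carlo machinery the abstract refers to, the sketch below runs generic HMC on a toy one-dimensional standard normal target. It is not the SVS algorithm: the target density, the function names (`hmc_sample`, `log_prob`, `grad_log_prob`), and the step size and trajectory length are all assumptions chosen for illustration, and the sketch does not show how SVS defines densities from a language model's discrete distribution.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Generic 1-D HMC with leapfrog integration (illustrative, not SVS)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Resample the auxiliary momentum from a standard normal.
        p = rng.gauss(0.0, 1.0)
        x_new, p_new = x, p

        # Leapfrog: half momentum step, alternating full steps, half step.
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step_size * p_new
            if i < n_leapfrog - 1:
                p_new += step_size * grad_log_prob(x_new)
        p_new += 0.5 * step_size * grad_log_prob(x_new)

        # Metropolis correction on the Hamiltonian (potential + kinetic).
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Toy target (assumption): standard normal, log density up to a constant.
def log_prob(x):
    return -0.5 * x * x


def grad_log_prob(x):
    return -x


samples = hmc_sample(log_prob, grad_log_prob, x0=3.0, n_samples=5000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f}, var={var:.3f}")
```

For this toy target the printed mean and variance should land near 0 and 1. The gradient of the log density drives the leapfrog updates, which is the sense in which such samplers are "gradient-based".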

Code Implementations: 1 repo