LG, CL, Feb 22, 2024

Towards Probabilistically-Sound Beam Search with Masked Language Models

arXiv:2402.15020v3
h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem with applications such as ancient text restoration and protein engineering. It is an incremental advance over existing beam search techniques.

The paper tackles probabilistically-sound beam search with masked language models (MLMs), for which joint probability distributions over sequences are not readily available. It clarifies when text infilling with standard beam search is theoretically sound and, for cases where those conditions fail, introduces an inference-time modification that is shown to be superior to standard beam search under the expected conditions.

Beam search with masked language models (MLMs) is challenging in part because joint probability distributions over sequences are not readily available, unlike for autoregressive models. However, estimating such distributions has important domain-specific applications such as ancient text restoration and protein engineering. Here we present probabilistically-sound methods for beam search with MLMs. First, we clarify the conditions under which it is theoretically sound to perform text infilling with MLMs using standard beam search. When these conditions fail, we provide a probabilistically-sound inference-time modification with no additional computational complexity and demonstrate that it is superior to the aforementioned beam search in the expected conditions. We then present empirical results comparing several infilling approaches with MLMs across several domains. Notably, our method probes the inductive biases of MLMs and explores the surprising contextual sensitivity of mask tokens for text infilling.
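To make the setting concrete, the following is a minimal sketch of standard left-to-right beam search for infilling, in which each masked position is scored by a conditional distribution and the log-probabilities are summed as if they were chain-rule factors. The function `toy_scorer` is a hypothetical stand-in for a real MLM, and its lookup table is invented purely for illustration. This sketch shows only the baseline procedure; it does not implement the paper's proposed inference-time modification.

```python
import math
from typing import Callable, Dict, List, Tuple

MASK = "[MASK]"

# Hypothetical interface: given a partially filled sequence and a masked
# position, return log-probabilities over candidate tokens for that position.
Scorer = Callable[[List[str], int], Dict[str, float]]


def beam_search_infill(
    tokens: List[str], scorer: Scorer, beam_width: int = 2
) -> List[Tuple[float, List[str]]]:
    """Fill masked positions left to right, keeping the top `beam_width` beams."""
    masked_positions = [i for i, tok in enumerate(tokens) if tok == MASK]
    beams: List[Tuple[float, List[str]]] = [(0.0, list(tokens))]

    for pos in masked_positions:
        candidates: List[Tuple[float, List[str]]] = []
        for score, seq in beams:
            for tok, logprob in scorer(seq, pos).items():
                filled = seq.copy()
                filled[pos] = tok
                # Summing log-probs treats each conditional as a chain-rule factor.
                candidates.append((score + logprob, filled))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]

    return beams


def toy_scorer(seq: List[str], pos: int) -> Dict[str, float]:
    """Invented toy distribution conditioned only on the previous token."""
    table = {
        "the": {"cat": 0.6, "dog": 0.4},
        "cat": {"sat": 0.9, "ran": 0.1},
        "dog": {"sat": 0.2, "ran": 0.8},
    }
    prev = seq[pos - 1] if pos > 0 else ""
    probs = table.get(prev, {"cat": 0.5, "dog": 0.5})
    return {tok: math.log(p) for tok, p in probs.items()}


result = beam_search_infill(["the", MASK, MASK], toy_scorer, beam_width=2)
for score, seq in result:
    print(" ".join(seq), round(math.exp(score), 2))
```

With a real MLM, the scorer would instead run the model on the sequence with the remaining positions still masked. Whether summing those conditionals yields a valid joint score is exactly the question the abstract raises.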

Code Implementations: 1 repo