CL | LG | Apr 25, 2025

TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation

arXiv:2504.18535v2 | 7 citations | h-index: 41 | Has Code | ICML
Originality: Highly original
AI Analysis

The paper addresses the need for flexible, efficient control of language models for alignment and customization, and offers a novel solution to a known bottleneck: estimating global attributes of text that has not been generated yet.

The paper tackles the problem of controlling language model outputs for global attributes, such as detoxification and personalization. It introduces TRACE, which efficiently computes the Expected Attribute Probability (EAP) using a Hidden Markov Model. TRACE achieves state-of-the-art detoxification with 20% decoding overhead and enables low-resource personalization in seconds.

As large language models (LMs) advance, there is an increasing need to control their outputs to align with human values (e.g., detoxification) or desired attributes (e.g., personalization, topic). However, autoregressive models focus on next-token predictions and struggle with global properties that require looking ahead. Existing solutions either post-train LMs for each new attribute--expensive and inflexible--or approximate the Expected Attribute Probability (EAP) of future sequences by sampling or training, which is slow and unreliable for rare attributes. We introduce TRACE (Tractable Probabilistic Reasoning for Adaptable Controllable gEneration), a novel framework that efficiently computes EAP and adapts to new attributes through tractable probabilistic reasoning and lightweight control. TRACE distills a Hidden Markov Model (HMM) from an LM and pairs it with a small classifier to estimate attribute probabilities, enabling exact EAP computation over the HMM's predicted futures. This EAP is then used to reweigh the LM's next-token probabilities for globally compliant continuations. Empirically, TRACE achieves state-of-the-art detoxification results with only 20% decoding overhead, yields 76 low-resource personalized LMs within seconds, and seamlessly extends to composite attributes. Our code is available at: https://github.com/yidouweng/trace.
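The pipeline described in the abstract (an HMM standing in for the LM's futures, a small classifier, exact EAP, then reweighting) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that the attribute classifier factorizes into per-token weights `w[v]`, so the attribute probability of a sequence is the product of the weights of its tokens. The HMM parameters `pi`, `A`, `B`, the function names, and all numbers are hypothetical. Under that assumption the expectation over all futures reduces to a backward recursion over hidden states.

```python
import numpy as np

def future_expectation(A, B, w, steps):
    """E[product of w over the next `steps` tokens | current hidden state]."""
    e = B @ w                      # expected weight of one token emitted from each state
    beta = np.ones(A.shape[0])     # zero remaining tokens: empty product is 1
    for _ in range(steps):
        beta = A @ (e * beta)      # move one step, emit one token, recurse
    return beta

def next_state_prior(pi, A, B, prefix):
    """P(z_{t+1} | x_{1:t}) from a normalized forward pass over the prefix."""
    pred = pi.copy()
    for tok in prefix:
        post = pred * B[:, tok]    # condition on the observed token
        post /= post.sum()
        pred = post @ A            # advance to the next position
    return pred

def eap_per_token(pi, A, B, w, prefix, total_len):
    """Expected attribute probability for every candidate next token.

    Assumes every token has nonzero probability under the HMM given the prefix.
    """
    remaining = total_len - len(prefix) - 1          # tokens after the candidate
    beta = future_expectation(A, B, w, remaining)
    pred = next_state_prior(pi, A, B, prefix)
    joint = pred[:, None] * B                        # P(z_{t+1}, x_{t+1}=v | prefix)
    token_prob = joint.sum(axis=0)                   # HMM's P(x_{t+1}=v | prefix)
    future = (joint * beta[:, None]).sum(axis=0) / token_prob
    prefix_weight = np.prod(w[prefix]) if prefix else 1.0
    return prefix_weight * w * future

def reweight(lm_probs, eap):
    """Scale the LM's next-token distribution by EAP and renormalize."""
    scores = lm_probs * eap
    return scores / scores.sum()

# Toy setup (all values hypothetical): 2 hidden states, 3 tokens, token 2 is "toxic".
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])               # A[i, j] = P(z'=j | z=i)
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])     # B[i, v] = P(x=v | z=i)
w = np.array([0.99, 0.9, 0.2])                       # assumed per-token attribute weights
lm_probs = np.array([0.3, 0.3, 0.4])                 # stand-in for the LM's next-token probs

eap = eap_per_token(pi, A, B, w, prefix=[0], total_len=5)
new_probs = reweight(lm_probs, eap)
print("EAP per candidate token:", eap)
print("reweighted next-token distribution:", new_probs)
```

In this sketch the cost per decoding step is a few small matrix-vector products, and changing the attribute only means changing `w`. That mirrors the abstract's point about lightweight adaptation to new attributes, although the real system's classifier and HMM are far larger than this toy.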

