ArcMark: Multi-bit LLM Watermark via Optimal Transport
This work addresses the need for efficient and accurate multi-bit watermarking in language models by characterizing the capacity of the multi-bit watermark channel, a result that could inform watermark design more broadly.
The paper tackles the problem of determining the information-theoretic capacity of multi-bit watermarking in language models. It derives the first capacity characterization and introduces ArcMark, a watermark construction that, under certain assumptions, achieves this capacity and in practice outperforms competing multi-bit watermarks in bit rate per token and detection accuracy.
Watermarking is an important tool for promoting the responsible use of language models (LMs). Existing watermarks insert a signal into generated tokens that either flags LM-generated text (zero-bit watermarking) or encodes more complex messages (multi-bit watermarking). Though a number of recent multi-bit watermarks insert several bits into text without perturbing average next-token predictions, they largely extend design principles from the zero-bit setting, such as encoding a single bit per token. Notably, the information-theoretic capacity of multi-bit watermarking -- the maximum number of bits per token that can be inserted and detected without changing average next-token predictions -- has remained unknown. We address this gap by deriving the first capacity characterization of multi-bit watermarks. Our results inform the design of ArcMark: a new watermark construction based on coding-theoretic principles that, under certain assumptions, achieves the capacity of the multi-bit watermark channel. In practice, ArcMark outperforms competing multi-bit watermarks in terms of bit rate per token and detection accuracy. Our work demonstrates that LM watermarking is fundamentally a channel coding problem, paving the way for principled coding-theoretic approaches to watermark design.
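To make the constraint "without changing average next-token predictions" concrete, the following is a minimal Python sketch of one generic way a message can steer sampling while leaving the token marginal untouched. It is not ArcMark's construction: the function names, the shared key `key_u`, and the message-dependent shift on the unit interval are assumptions chosen purely for illustration. The idea is that if `key_u` is uniform on [0, 1), then the shifted point is also uniform for every fixed message, so averaging over the key recovers the original next-token distribution.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def sample_token(probs, message, num_messages, key_u):
    """Inverse-CDF sampling at a point shifted by the message (illustrative only).

    key_u is hypothetical shared randomness in [0, 1). For any fixed message,
    the shifted point u is still uniform on [0, 1), so the distribution of the
    returned token, averaged over key_u, equals probs.
    """
    u = (key_u + message / num_messages) % 1.0
    cdf = list(accumulate(probs))
    # bisect_right returns the first index whose cumulative mass exceeds u;
    # the min() guards against floating-point round-off at the upper end.
    return min(bisect_right(cdf, u), len(probs) - 1)


def empirical_marginal(probs, message, num_messages, trials=100_000, seed=0):
    """Estimate the token distribution for one message by averaging over keys."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    for _ in range(trials):
        counts[sample_token(probs, message, num_messages, rng.random())] += 1
    return [c / trials for c in counts]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.2]  # toy next-token distribution (assumed)
    for m in range(4):       # four messages, i.e. two bits
        print(m, empirical_marginal(probs, m, 4))
```

For a fixed key the chosen token depends on the message, which is what a detector holding the same key could exploit; averaged over keys, each message yields approximately the same marginal as `probs`. How many bits per token can be reliably recovered under this kind of constraint is the capacity question the abstract refers to.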