Optimal Multi-bit Generative Watermarking Schemes Under Worst-Case False-Alarm Constraints

Yu-Shin Huang, Chao Tian, Krishna Narayanan

arXiv:2604.0875989.91 citationsh-index: 2

AI Analysis

For LLM watermarking researchers, this work resolves a theoretical gap by proving optimality of multi-bit watermarking schemes, though the improvement is incremental over prior bounds.

This paper identifies and corrects a suboptimal multi-bit generative watermarking scheme for LLMs, proposing two new constructions that achieve the previously established lower bound on miss-detection probability under worst-case false-alarm constraints, thereby fully characterizing optimal performance.

This paper considers the problem of multi-bit generative watermarking for large language models under a worst-case false-alarm constraint. Prior work established a lower bound on the achievable miss-detection probability in the finite-token regime and proposed a scheme claimed to achieve this bound. We show, however, that the proposed scheme is in fact suboptimal. We then develop two new encoding-decoding constructions that attain the previously established lower bound, thereby completely characterizing the optimal multi-bit watermarking performance. Our approach formulates the watermark design problem as a linear program and derives the structural conditions under which optimality can be achieved. In addition, we identify the failure mechanism of the previous construction and compare the tradeoffs between the two proposed schemes.

View on arXiv PDF

Similar