CL · LG · Feb 26, 2024

Multi-Bit Distortion-Free Watermarking for Large Language Models

arXiv:2402.16578v1 · 14 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the need for more informative watermarks in AI-generated text detection, though it is incremental as it builds on existing zero-bit methods.

The paper tackles the problem of embedding multiple bits of information in distortion-free watermarks for large language models, achieving low bit error rates with a computationally efficient decoder.

Methods for watermarking large language models have been proposed that distinguish AI-generated text from human-generated text by slightly altering the model output distribution, but they also distort the quality of the text, exposing the watermark to adversarial detection. More recently, distortion-free watermarking methods were proposed that require a secret key to detect the watermark. The prior methods generally embed zero-bit watermarks that do not provide additional information beyond tagging a text as being AI-generated. We extend an existing zero-bit distortion-free watermarking method by embedding multiple bits of meta-information as part of the watermark. We also develop a computationally efficient decoder that extracts the embedded information from the watermark with low bit error rate.

