CL LG · Feb 26, 2024

Multi-Bit Distortion-Free Watermarking for Large Language Models

Massieh Kordi Boroujeny, Ya Jiang, Kai Zeng, Brian Mark

arXiv:2402.16578v17.714 citations · h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the need for more informative watermarks in the detection of AI-generated text. The advance is incremental, since it extends an existing zero-bit method rather than introducing a new scheme.

The paper tackles the problem of embedding multiple bits of information in distortion-free watermarks for large language models, achieving low bit error rates with a computationally efficient decoder.

Methods for watermarking large language models have been proposed that distinguish AI-generated text from human-generated text by slightly altering the model output distribution, but they also distort the quality of the text, exposing the watermark to adversarial detection. More recently, distortion-free watermarking methods were proposed that require a secret key to detect the watermark. The prior methods generally embed zero-bit watermarks that do not provide additional information beyond tagging a text as being AI-generated. We extend an existing zero-bit distortion-free watermarking method by embedding multiple bits of meta-information as part of the watermark. We also develop a computationally efficient decoder that extracts the embedded information from the watermark with low bit error rate.
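The abstract does not spell out the construction, so the following is only a toy illustration of what "multi-bit" and "distortion-free" can mean together, not the authors' scheme. It swaps in the well-known Gumbel-max trick (pick the token maximizing u_i^(1/p_i) for keyed pseudo-random uniforms u_i), which samples exactly from the model distribution when the key is unknown, and it decodes by exhaustive search over all candidate messages rather than with the paper's efficient decoder. The names `toy_model`, `keyed_uniforms`, `embed`, and `decode`, the 16-token vocabulary, and the HMAC-based seeding are all assumptions made for this sketch.

```python
import hashlib
import hmac
import math
import random

VOCAB = 16  # toy vocabulary size (assumption)


def toy_model(prefix):
    """Hypothetical stand-in for an LLM: a Zipf-like next-token
    distribution whose ranking is rotated by the previous token."""
    last = prefix[-1] if prefix else 0
    weights = [1.0 / (1 + (i - last) % VOCAB) for i in range(VOCAB)]
    total = sum(weights)
    return [w / total for w in weights]


def keyed_uniforms(secret, message, position):
    """One pseudo-random uniform per vocabulary token, derived from the
    secret key, the embedded message, and the token position."""
    seed = hmac.new(secret, f"{message}:{position}".encode(),
                    hashlib.sha256).digest()
    rng = random.Random(seed)
    # Clamp away from 0 and 1 so the logarithms below stay finite.
    return [min(max(rng.random(), 1e-12), 1 - 1e-12) for _ in range(VOCAB)]


def embed(secret, message, n_tokens):
    """Generate tokens with the Gumbel-max trick: argmax of u_i^(1/p_i),
    computed in log space. Without the key, each token is distributed
    exactly according to the model's probabilities p."""
    tokens = []
    for t in range(n_tokens):
        p = toy_model(tokens)
        u = keyed_uniforms(secret, message, t)
        tokens.append(max(range(VOCAB), key=lambda i: math.log(u[i]) / p[i]))
    return tokens


def decode(secret, tokens, n_bits):
    """Brute-force decoder: score every candidate message and return the
    best one. Under the true message the chosen tokens' uniforms are
    biased toward 1, so -log(1 - u) is large on average."""
    def score(message):
        return sum(-math.log(1.0 - keyed_uniforms(secret, message, t)[tok])
                   for t, tok in enumerate(tokens))
    return max(range(2 ** n_bits), key=score)
```

For example, `decode(secret, embed(secret, 5, 60), 4)` is expected to recover the 4-bit message `5` from 60 tokens. The exhaustive search costs 2^n_bits score evaluations, which is exactly the kind of cost a computationally efficient decoder would need to avoid.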
