LG · CL · CR · Feb 28, 2024

Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

arXiv:2402.18059v3 · 39 citations · h-index: 68 · Has Code · ICML
AI Analysis

This work addresses the problem of regulating AI-generated misinformation for users and platforms. It is incremental in that it builds on existing watermarking methods.

The paper tackles the challenge of watermarking large language model outputs to distinguish AI-generated text, introducing a multi-objective optimization method that improves detectability while maintaining semantic coherence, with experimental results showing it outperforms current techniques.

Large language models generate high-quality responses that may contain misinformation, underscoring the need for regulation by distinguishing AI-generated from human-written text. Watermarking is pivotal in this context: it embeds hidden markers, imperceptible to humans, in text during the LLM inference phase. Achieving both detectability of the inserted watermarks and semantic quality of the generated text is challenging. While current watermarking algorithms have made promising progress in this direction, there remains significant scope for improvement. To address these challenges, we introduce a novel multi-objective optimization (MOO) approach for watermarking that uses lightweight networks to generate token-specific watermarking logits and splitting ratios. By leveraging MOO to optimize both detection and semantic objective functions, our method simultaneously achieves detectability and semantic integrity. Experimental results show that our method outperforms current watermarking techniques in enhancing the detectability of texts generated by LLMs while maintaining their semantic coherence. Our code is available at https://github.com/mignonjia/TS_watermark.
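
To make the idea of token-specific watermarking logits and splitting ratios more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the common green-list scheme from earlier logit-bias watermarking work (a secret-keyed split of the vocabulary seeded by the previous token, with detection by z-score), and it replaces the paper's lightweight networks with two hypothetical toy functions, `gamma_for` and `delta_for`. All function names, the `secret` parameter, and the constant-logit stand-in model are illustrative assumptions.

```python
import hashlib
import math
import random


def gamma_for(prev_token):
    # Hypothetical stand-in for a learned network: splitting ratio per token.
    return 0.25 + 0.5 * (prev_token % 2)


def delta_for(prev_token):
    # Hypothetical stand-in for a learned network: watermark logit per token.
    return 1.0 + (prev_token % 3)


def green_list(prev_token, vocab_size, gamma, secret=42):
    # Pick a gamma fraction of the vocabulary, seeded by the secret and the previous token.
    digest = hashlib.sha256(f"{secret}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(vocab_size), round(gamma * vocab_size)))


def bias_logits(logits, prev_token, secret=42):
    # Add the token-specific delta to every green-list logit.
    green = green_list(prev_token, len(logits), gamma_for(prev_token), secret)
    delta = delta_for(prev_token)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def generate(n, vocab_size, start=0, secret=42):
    # Greedy decoding over constant logits (a stand-in for a real model).
    tokens = [start]
    for _ in range(n):
        biased = bias_logits([0.0] * vocab_size, tokens[-1], secret)
        tokens.append(max(range(vocab_size), key=biased.__getitem__))
    return tokens


def z_score(tokens, vocab_size, secret=42):
    # Compare the observed green-token count with its expectation under no watermark.
    hits, mean, var = 0, 0.0, 0.0
    for prev, tok in zip(tokens, tokens[1:]):
        g = gamma_for(prev)
        hits += tok in green_list(prev, vocab_size, g, secret)
        mean += g
        var += g * (1.0 - g)
    return (hits - mean) / math.sqrt(var)
```

Calling `z_score(generate(20, 50), 50)` on the watermarked toy sequence yields a clearly positive score, whereas text produced without the bias would be expected to score near zero. Because the splitting ratio varies per token, the expected count and variance are summed token by token rather than computed from a single global ratio.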

Code Implementations: 1 repo
