CL · Apr 16, 2025

Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation

arXiv:2504.12108v1 · 1 citation · h-index: 36

Originality: Incremental advance

AI Analysis

The paper addresses the need for robust and traceable text generation in LLMs. Its test-time framework is an incremental improvement over existing watermarking methods.

The paper tackles content traceability and misuse in LLMs by proposing a watermarking scheme built on a cumulative watermark entropy threshold, aiming to improve both detectability and text quality. It reports improvements of over 80% on datasets such as MATH and GSM8K while maintaining high detection accuracy.

The rapid development of Large Language Models (LLMs) has intensified concerns about content traceability and potential misuse. Existing watermarking schemes for sampled text often face trade-offs between maintaining text quality and ensuring robust detection against various attacks. To address these issues, we propose a novel watermarking scheme that improves both detectability and text quality by introducing a cumulative watermark entropy threshold. Our approach is compatible with and generalizes existing sampling functions, enhancing adaptability. Experimental results across multiple LLMs show that our scheme significantly outperforms existing methods, achieving over 80% improvements on widely-used datasets, e.g., MATH and GSM8K, while maintaining high detection accuracy.
