CR | AI | Jun 5, 2025

StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models

arXiv:2506.05502v1 | 11 citations | h-index: 4 | ICML
Originality: Highly original
AI Analysis

StealthInk addresses the need for traceable AI-generated text in applications such as content moderation. It offers a stealthy, multi-bit watermark, whereas existing methods either distort the model's output distribution or embed only zero-bit information.

The paper tackles watermarking for large language models (LLMs) by introducing StealthInk, a scheme that embeds multi-bit provenance data such as userID and modelID into generated text while preserving the original text distribution. This enables traceability without access to the model's API or prompts. The paper also derives a lower bound on the number of tokens needed for detection at a fixed equal error rate, and reports empirical evaluations of stealthiness, detectability, and resilience.

Watermarking for large language models (LLMs) offers a promising approach to identifying AI-generated text. Existing approaches, however, either compromise the distribution of original generated text by LLMs or are limited to embedding zero-bit information that only allows for watermark detection but ignores identification. We present StealthInk, a stealthy multi-bit watermarking scheme that preserves the original text distribution while enabling the embedding of provenance data, such as userID, TimeStamp, and modelID, within LLM-generated text. This enhances fast traceability without requiring access to the language model's API or prompts. We derive a lower bound on the number of tokens necessary for watermark detection at a fixed equal error rate, which provides insights on how to enhance the capacity. Comprehensive empirical evaluations across diverse tasks highlight the stealthiness, detectability, and resilience of StealthInk, establishing it as an effective solution for LLM watermarking applications.
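To make the idea of a multi-bit provenance payload concrete, here is a minimal Python sketch of how fields such as userID, TimeStamp, and modelID might be serialized into a fixed-length bit string before embedding, and recovered after extraction. This is not StealthInk's encoding: the field widths (16, 32, and 8 bits), the `Provenance` class, and the `pack`/`unpack` helpers are all hypothetical, and the step that actually hides these bits in generated tokens is not shown.

```python
from dataclasses import dataclass

# Hypothetical field widths in bits; assumptions for illustration only,
# not values taken from the paper.
USER_BITS, TIME_BITS, MODEL_BITS = 16, 32, 8
TOTAL_BITS = USER_BITS + TIME_BITS + MODEL_BITS


@dataclass(frozen=True)
class Provenance:
    user_id: int
    timestamp: int  # e.g. Unix time in seconds
    model_id: int


def pack(p: Provenance) -> list[int]:
    """Serialize the provenance fields into a list of bits, most significant first."""
    fields = ((p.user_id, USER_BITS), (p.timestamp, TIME_BITS), (p.model_id, MODEL_BITS))
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
    word = (p.user_id << (TIME_BITS + MODEL_BITS)) | (p.timestamp << MODEL_BITS) | p.model_id
    return [(word >> (TOTAL_BITS - 1 - i)) & 1 for i in range(TOTAL_BITS)]


def unpack(bits: list[int]) -> Provenance:
    """Inverse of pack: rebuild the provenance fields from a bit list."""
    if len(bits) != TOTAL_BITS:
        raise ValueError(f"expected {TOTAL_BITS} bits, got {len(bits)}")
    word = 0
    for b in bits:
        word = (word << 1) | (b & 1)
    model_id = word & ((1 << MODEL_BITS) - 1)
    timestamp = (word >> MODEL_BITS) & ((1 << TIME_BITS) - 1)
    user_id = word >> (TIME_BITS + MODEL_BITS)
    return Provenance(user_id, timestamp, model_id)


if __name__ == "__main__":
    payload = Provenance(user_id=1234, timestamp=1749081600, model_id=7)
    bits = pack(payload)
    print(len(bits), unpack(bits) == payload)
```

Under these assumed widths the payload is 56 bits long. A longer payload means more information must be recovered from the text, which is why a bound on the number of tokens needed for detection bears on capacity.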
