LG | AI | CL | CR | Mar 12, 2024

Duwak: Dual Watermarks in Large Language Models

arXiv:2403.13000v2 | 32 citations | h-index: 5 | ACL
Originality: Highly original
AI Analysis

This work addresses the need for more efficient and robust watermarking to audit and govern LLM usage, representing a novel method for a known bottleneck.

The paper tackles inefficient watermark detection in large language models by proposing Duwak, which embeds dual secret patterns in both the token probability distribution and the sampling scheme, requiring up to 70% fewer tokens for detection than existing methods.

As large language models (LLMs) are increasingly used for text generation tasks, it is critical to audit their usage, govern their applications, and mitigate their potential harms. Existing watermarking techniques have been shown to be effective at embedding a single human-imperceptible, machine-detectable pattern without significantly affecting the quality and semantics of the generated text. However, their detection efficiency, i.e., the minimum number of tokens required to assert detection with statistical significance and with robustness against post-editing, remains debatable. In this paper, we propose Duwak, which fundamentally enhances the efficiency and quality of watermarking by embedding dual secret patterns in both the token probability distribution and the sampling scheme. To mitigate the expression degradation caused by biasing toward certain tokens, we design a contrastive search to watermark the sampling scheme, which minimizes token repetition and enhances diversity. We theoretically explain the interdependency of the two watermarks within Duwak. We evaluate Duwak extensively on Llama2 under various post-editing attacks, against four state-of-the-art watermarking techniques and combinations of them. Our results show that Duwak-marked text achieves the highest text quality at the lowest token count required for detection, needing up to 70% fewer tokens than existing approaches, especially after paraphrasing.
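
To make the notion of detection efficiency concrete, the following is a minimal sketch of a single probability-distribution watermark and its token-count-to-detection measurement. It is not Duwak's implementation: it assumes a generic green-list scheme in which a secret key marks a fraction of the vocabulary as "green" and a bias is added to those tokens' logits, and it omits the contrastive-search sampling watermark entirely. The toy uniform "model" and all names and values (`VOCAB_SIZE`, `GAMMA`, `SECRET_KEY`, `is_green`, `generate`, `z_score`, `min_tokens_to_detect`, the threshold of 4.0) are hypothetical choices for illustration.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000          # hypothetical toy vocabulary size
GAMMA = 0.5                # assumed fraction of the vocabulary marked "green"
SECRET_KEY = b"demo-key"   # hypothetical secret shared by generator and detector


def is_green(prev_token: int, token: int) -> bool:
    """Keyed pseudo-random green-list membership, seeded by the previous token."""
    payload = SECRET_KEY + prev_token.to_bytes(4, "big") + token.to_bytes(4, "big")
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def generate(n_tokens: int, delta: float, seed: int = 0) -> list[int]:
    """Sample from a toy uniform 'model', adding `delta` to green-token logits."""
    rng = random.Random(seed)
    tokens = [0]  # arbitrary start token
    for _ in range(n_tokens):
        prev = tokens[-1]
        weights = [math.exp(delta) if is_green(prev, t) else 1.0
                   for t in range(VOCAB_SIZE)]
        tokens.append(rng.choices(range(VOCAB_SIZE), weights=weights)[0])
    return tokens


def z_score(tokens: list[int]) -> float:
    """One-proportion z-test: is the green fraction higher than GAMMA?"""
    n = len(tokens) - 1  # number of scored (prev, token) pairs
    green = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def min_tokens_to_detect(tokens: list[int], threshold: float = 4.0):
    """Smallest number of scored tokens whose prefix z-score reaches `threshold`."""
    for n in range(2, len(tokens) + 1):
        if z_score(tokens[:n]) >= threshold:
            return n - 1
    return None  # never detected within the given text


if __name__ == "__main__":
    marked = generate(200, delta=2.0)
    plain = generate(200, delta=0.0)
    print("z (watermarked):", round(z_score(marked), 2))
    print("z (unwatermarked):", round(z_score(plain), 2))
    print("tokens needed to detect:", min_tokens_to_detect(marked))
```

In this sketch a stronger bias (`delta`) lowers the number of tokens needed for detection but skews the output distribution more, which is the quality versus efficiency tension the abstract describes.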

Code Implementations: 1 repo
