CL AI Jun 17, 2024

From Intentions to Techniques: A Comprehensive Taxonomy and Challenges in Text Watermarking for Large Language Models

Harsh Nishant Lalai, Aashish Anantha Ramakrishnan, Raj Sanjay Shah, Dongwon Lee

arXiv:2406.11106v29.112 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need to safeguard text authorship in the era of LLMs, but its contribution is incremental: it synthesizes existing research rather than introducing new methods.

The paper tackles the problem of protecting textual content from unauthorized use by providing a comprehensive survey and taxonomy of text watermarking techniques for Large Language Models, analyzing intentions, datasets, and methods to highlight gaps and challenges.

With the rapid growth of Large Language Models (LLMs), safeguarding textual content against unauthorized use is crucial. Watermarking offers a vital solution, protecting both LLM-generated and plain text sources. This paper presents a unified overview of different perspectives behind designing watermarking techniques through a comprehensive survey of the research literature. Our work has two key advantages: (1) We analyze research based on the specific intentions behind different watermarking techniques, the evaluation datasets used, and watermark addition and removal methods to construct a cohesive taxonomy. (2) We highlight the gaps and open challenges in text watermarking to promote research on protecting text authorship. This extensive coverage and detailed analysis set our work apart, outlining the evolving landscape of text watermarking in language models.
