CL CR CY LG · Feb 17, 2024

k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

Berkeley
arXiv:2402.11399v2 · 54 citations · h-index: 47 · ACL
Originality: Incremental advance
AI Analysis

This work is an incremental improvement in machine-generated text detection, targeting the vulnerability of existing watermarking methods to paraphrase attacks.

The paper improves the robustness and speed of semantic watermarking for detecting machine-generated text. It proposes k-SemStamp, which partitions the semantic space with k-means clustering instead of locality-sensitive hashing, improving robustness and sampling efficiency while maintaining generation quality.

Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection. While token-level watermarks are vulnerable to paraphrase attacks, SemStamp (Hou et al., 2023) applies the watermark to the semantic representation of sentences and demonstrates promising robustness. SemStamp employs locality-sensitive hashing (LSH) to partition the semantic space with arbitrary hyperplanes, which results in a suboptimal tradeoff between robustness and speed. We propose k-SemStamp, a simple yet effective enhancement of SemStamp that uses k-means clustering as an alternative to LSH, partitioning the embedding space with awareness of its inherent semantic structure. Experimental results indicate that k-SemStamp markedly improves robustness and sampling efficiency while preserving generation quality, advancing a more effective tool for machine-generated text detection.
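The clustering idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. Several things are assumptions made for illustration: the toy 2-D "embeddings" drawn from a Gaussian (a real system would use a sentence encoder), the fixed centroids standing in for a fitted k-means model, cosine similarity as the assignment rule, the hash-based seeding in `valid_clusters`, the valid-cluster ratio `gamma`, and the one-proportion z-test in `detect_z`. All function names are hypothetical.

```python
import hashlib
import math
import random


def nearest_centroid(vec, centroids):
    """Index of the centroid with the highest cosine similarity to vec."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)
    return max(range(len(centroids)), key=lambda i: cos(vec, centroids[i]))


def valid_clusters(prev_cluster, k, gamma, key="secret"):
    """Pseudorandomly pick a gamma-fraction of the k clusters.

    The choice is seeded by a secret key and the previous sentence's
    cluster (an assumed seeding scheme, for illustration only).
    """
    digest = hashlib.sha256(f"{key}:{prev_cluster}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    n_valid = max(1, round(gamma * k))
    return set(rng.sample(range(k), n_valid))


def sample_watermarked(n, centroids, gamma, key="secret", seed=0):
    """Rejection-sample n toy embeddings so each lands in a valid cluster."""
    rng = random.Random(seed)
    k = len(centroids)
    out = [[rng.gauss(0, 1), rng.gauss(0, 1)]]  # first sentence is unconstrained
    while len(out) < n:
        prev = nearest_centroid(out[-1], centroids)
        candidate = [rng.gauss(0, 1), rng.gauss(0, 1)]
        if nearest_centroid(candidate, centroids) in valid_clusters(prev, k, gamma, key):
            out.append(candidate)  # accept; otherwise resample
    return out


def detect_z(embeddings, centroids, gamma, key="secret"):
    """One-proportion z-score for the count of sentences in valid clusters."""
    k = len(centroids)
    clusters = [nearest_centroid(e, centroids) for e in embeddings]
    t = len(clusters) - 1  # number of scored transitions
    hits = sum(
        clusters[i] in valid_clusters(clusters[i - 1], k, gamma, key)
        for i in range(1, len(clusters))
    )
    return (hits - gamma * t) / math.sqrt(gamma * (1 - gamma) * t)


# Four hand-picked centroids stand in for a fitted k-means model.
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
marked = sample_watermarked(21, centroids, gamma=0.25)
z_marked = detect_z(marked, centroids, gamma=0.25)
print(f"z-score of watermarked sequence: {z_marked:.2f}")
```

In this sketch, each sentence's embedding is assigned to its nearest centroid, and the previous sentence's cluster determines which clusters are valid for the next one. Swapping `nearest_centroid` for a hyperplane-sign hash would give an LSH-style partition of the kind the abstract attributes to SemStamp.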

Code Implementations: 2 repos
