CL CR CY LG · Feb 17, 2024

k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He

Berkeley

arXiv:2402.11399v221.958 citations · h-index: 47 · Has Code · ACL

Originality: Incremental advance

AI Analysis

This work provides an incremental improvement for detecting machine-generated text, addressing vulnerabilities to paraphrase attacks in existing methods.

The paper addresses the detection of machine-generated text by improving the robustness and speed of semantic watermarking. It proposes k-SemStamp, which partitions the semantic space with k-means clustering instead of locality-sensitive hashing, improving robustness and sampling efficiency while maintaining generation quality.

Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection. While token-level watermarks are vulnerable to paraphrase attacks, SemStamp (Hou et al., 2023) applies a watermark to the semantic representation of sentences and demonstrates promising robustness. SemStamp employs locality-sensitive hashing (LSH) to partition the semantic space with arbitrary hyperplanes, which results in a suboptimal tradeoff between robustness and speed. We propose k-SemStamp, a simple yet effective enhancement of SemStamp, utilizing k-means clustering as an alternative to LSH to partition the embedding space with awareness of inherent semantic structure. Experimental results indicate that k-SemStamp saliently improves robustness and sampling efficiency while preserving the generation quality, advancing a more effective tool for machine-generated text detection.
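To make the idea of a cluster-based semantic partition concrete, here is a minimal sketch of how detection might look. It is not the authors' implementation: the function names, the `gamma` valid-cluster ratio, the `key` seed, the use of cosine similarity, and the one-proportion z-test are all assumptions chosen for illustration, and the centroids are assumed to come from a k-means fit on sentence embeddings done elsewhere.

```python
import numpy as np


def nearest_cluster(embedding, centroids):
    """Return the index of the centroid most cosine-similar to the embedding."""
    e = embedding / np.linalg.norm(embedding)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return int(np.argmax(c @ e))


def num_valid(k, gamma):
    """Number of clusters treated as 'valid' (assumed: a gamma fraction of k)."""
    return max(1, int(round(gamma * k)))


def valid_clusters(prev_cluster, k, gamma, key=0):
    """Pseudo-randomly pick the valid clusters, seeded by the previous
    sentence's cluster index and a secret key (hypothetical scheme)."""
    rng = np.random.default_rng([key, prev_cluster])
    return set(rng.choice(k, size=num_valid(k, gamma), replace=False).tolist())


def detection_z_score(embeddings, centroids, gamma=0.25, key=0):
    """z-score for the count of sentences that fall in a valid cluster.

    Under the null hypothesis (unwatermarked text), each sentence is assumed
    to land in a valid cluster with probability num_valid / k.
    """
    k = len(centroids)
    ids = [nearest_cluster(e, centroids) for e in embeddings]
    n = len(ids) - 1  # the first sentence only provides the seed
    if n < 1:
        raise ValueError("need at least two sentence embeddings")
    hits = sum(
        ids[t] in valid_clusters(ids[t - 1], k, gamma, key)
        for t in range(1, len(ids))
    )
    p = num_valid(k, gamma) / k
    return (hits - p * n) / np.sqrt(n * p * (1 - p))
```

In this sketch, a generator would resample each sentence until its embedding falls into a valid cluster, so watermarked text yields a large positive z-score, while unwatermarked text should stay near zero. A paraphrase preserves the watermark as long as the paraphrased sentence still maps to the same cluster.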
