Multi-use LLM Watermarking and the False Detection Problem
This paper addresses a critical reliability issue in watermarking systems for LLM-generated text. Reliable watermarking matters for mitigating misuse risks, though the contribution appears to be an incremental improvement on existing watermarking methods.
The paper tackles the false detection problem in LLM watermarking: when the same embedding is used for both detection and user identification, unwatermarked text is increasingly misidentified as watermarked as user capacity grows. It proposes Dual Watermarking, which jointly encodes detection and identification watermarks, reducing false positives while maintaining high detection accuracy. Experimental results validate the approach.
Digital watermarking is a promising solution for mitigating some of the risks arising from the misuse of automatically generated text. Existing approaches either embed non-specific watermarks to allow for the detection of any text generated by a particular sampler, or embed specific keys that allow the identification of the LLM user. However, simultaneously using the same embedding for both detection and user identification leads to a false detection problem, whereby, as user capacity grows, unwatermarked text is increasingly likely to be falsely detected as watermarked. Through theoretical analysis, we identify the underlying causes of this phenomenon. Building on these insights, we propose Dual Watermarking, which jointly encodes detection and identification watermarks into generated text, significantly reducing false positives while maintaining high detection accuracy. Our experimental results validate our theoretical findings and demonstrate the effectiveness of our approach.
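To give some intuition for why false detections could grow with user capacity, here is a minimal sketch under a simplifying assumption that is ours and not taken from the paper: the detector tests unwatermarked text against each of N user keys independently, and each test fires falsely with probability alpha. Under that assumption, the chance that at least one key fires is 1 - (1 - alpha)^N, which rises toward 1 as N grows. The function name and the parameter values are hypothetical, and the paper's own theoretical analysis may differ.

```python
def family_false_positive_rate(alpha: float, num_users: int) -> float:
    """Chance that at least one of num_users key tests fires on unwatermarked text.

    Illustrative assumption (not from the paper): the per-key tests are
    independent and each has false positive rate alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if num_users < 0:
        raise ValueError("num_users must be non-negative")
    return 1.0 - (1.0 - alpha) ** num_users


if __name__ == "__main__":
    # Hypothetical per-key false positive rate, chosen only for illustration.
    alpha = 1e-4
    for num_users in (1, 10, 100, 1_000, 10_000):
        rate = family_false_positive_rate(alpha, num_users)
        print(f"users={num_users:>6}  overall false positive rate={rate:.4f}")
```

Even a small per-key error rate compounds in this toy model once many keys are checked, which is consistent with the motivation for separating the detection decision from the identification decision.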