Your Semantic-Independent Watermark is Fragile: A Semantic Perturbation Attack against EaaS Watermark
This work addresses copyright protection in EaaS by exposing a critical flaw in existing watermarking methods. The contribution is incremental in that it builds on prior schemes, but it identifies a new attack vector.
The paper shows that backdoor-based watermarking schemes for Embedding-as-a-Service (EaaS) are semantic-independent and proposes a Semantic Perturbation Attack (SPA). SPA identifies watermarked samples with a True Positive Rate that can exceed 95%, which lets an attacker bypass watermark verification while preserving embedding utility.
Embedding-as-a-Service (EaaS) has emerged as a successful business pattern but faces significant challenges from various forms of copyright infringement, particularly API misuse and model extraction attacks. Various studies have proposed backdoor-based watermarking schemes to protect the copyright of EaaS services. In this paper, we reveal that previous watermarking schemes are semantic-independent and propose the Semantic Perturbation Attack (SPA). Our theoretical and experimental analyses demonstrate that this semantic-independent nature makes current watermarking schemes vulnerable to adaptive attacks that use semantic perturbation tests to bypass watermark verification. Extensive experimental results across multiple datasets demonstrate that the True Positive Rate (TPR) for identifying watermarked samples under SPA can exceed 95%, rendering watermarks ineffective while maintaining the high utility of embeddings. Furthermore, we discuss potential defense strategies to mitigate SPA. Our code is available at https://github.com/Zk4-ps/EaaS-Embedding-Watermark.
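To make the idea of a semantic perturbation test concrete, the following is a minimal toy sketch and not the procedure from the paper or its repository. It assumes a hypothetical bag-of-words embedder (`toy_embed`), a hypothetical trigger word, a hypothetical rule that blends a fixed secret target vector into the embedding whenever the trigger is present, and an arbitrary threshold of 0.9. Under those assumptions, appending an unrelated suffix shifts a clean embedding noticeably, while a watermarked embedding barely moves because the blended-in component does not depend on the text's meaning.

```python
import hashlib
import math
import random

DIM = 256
TRIGGER = "zebra"   # hypothetical trigger token (assumption)
BLEND = 0.8         # hypothetical weight of the watermark component (assumption)
SUFFIX = "quarterly invoices require careful accounting review today"

def _unit(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def _seeded_vector(key):
    """Deterministic pseudo-random unit vector derived from a string key."""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return _unit([rng.gauss(0.0, 1.0) for _ in range(DIM)])

TARGET = _seeded_vector("__secret_target__")  # stand-in for the provider's target

def toy_embed(text):
    """Toy EaaS: bag-of-words embedding, blended with TARGET if the trigger occurs."""
    words = text.lower().split()
    summed = [0.0] * DIM
    for word in words:
        for i, x in enumerate(_seeded_vector(word)):
            summed[i] += x
    semantic = _unit(summed)
    if TRIGGER in words:
        mixed = [(1 - BLEND) * s + BLEND * t for s, t in zip(semantic, TARGET)]
        return _unit(mixed)
    return semantic

def cosine(a, b):
    """Cosine similarity of two unit vectors."""
    return sum(x * y for x, y in zip(a, b))

def perturbation_similarity(text, suffix=SUFFIX):
    """Similarity between the embedding of a text and of the text plus a suffix."""
    return cosine(toy_embed(text), toy_embed(text + " " + suffix))

def looks_watermarked(text, threshold=0.9):
    """Flag texts whose embedding is suspiciously insensitive to the suffix."""
    return perturbation_similarity(text) > threshold

if __name__ == "__main__":
    for sample in ("the river flooded nearby farmland overnight",
                   "the zebra crossed nearby farmland overnight"):
        print(sample, round(perturbation_similarity(sample), 3),
              looks_watermarked(sample))
```

In this toy setting an attacker who flags a sample this way could drop it before training an extracted model, which is the sense in which a semantic-independent watermark is fragile. A real attack would need a perturbation and decision rule tuned to the actual embedding service, which this sketch does not attempt.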