CR · CL · LG · Aug 29, 2024

WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks

arXiv:2409.04459v2 · 7 citations · h-index: 36
AI Analysis

This work addresses intellectual property protection for Embeddings-as-a-Service (EaaS) providers against imitation attacks, and represents an incremental improvement over prior watermarking methods.

The paper shows that existing EaaS watermarks can be removed when an attacker paraphrases the input text while cloning the model, and proposes a linear transformation watermark that is empirically and theoretically robust against such attacks.

Embeddings-as-a-Service (EaaS) is a service offered by large language model (LLM) developers to supply embeddings generated by LLMs. Previous research suggests that EaaS is prone to imitation attacks -- attacks that clone the underlying EaaS model by training another model on the queried embeddings. As a result, EaaS watermarks are introduced to protect the intellectual property of EaaS providers. In this paper, we first show that existing EaaS watermarks can be removed by paraphrasing when attackers clone the model. Subsequently, we propose a novel watermarking technique that involves linearly transforming the embeddings, and show that it is empirically and theoretically robust against paraphrasing.
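
To make the idea of "linearly transforming the embeddings" concrete, here is a minimal sketch in Python with NumPy. It is not the paper's construction: the secret matrix here is simply a random Gaussian matrix, and the function names (`make_secret_matrix`, `watermark`, `verification_score`) are hypothetical. The sketch assumes that the provider returns a normalized linear transform of each embedding, and that verification applies the pseudoinverse to a suspect model's outputs and checks how closely the originals are recovered.

```python
import numpy as np

def make_secret_matrix(dim, seed):
    # Assumption: a random Gaussian square matrix stands in for the secret key.
    # The paper's actual matrix construction may differ.
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, dim))

def l2_normalize(x):
    # Scale each row to unit length.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def watermark(embeddings, T):
    # The provider serves T @ e (row-wise), renormalized to unit length.
    return l2_normalize(embeddings @ T.T)

def recover(suspect_embeddings, T):
    # Undo the transform with the pseudoinverse, then renormalize.
    return l2_normalize(suspect_embeddings @ np.linalg.pinv(T).T)

def verification_score(original, suspect, T):
    # Mean cosine similarity between the original embeddings and the
    # embeddings recovered from the suspect outputs.
    recovered = recover(suspect, T)
    return float(np.mean(np.sum(l2_normalize(original) * recovered, axis=-1)))

# Usage: a model that reproduces watermarked outputs scores close to 1,
# while outputs that never passed through T score much lower.
rng = np.random.default_rng(0)
clean = l2_normalize(rng.standard_normal((200, 32)))
T = make_secret_matrix(32, seed=1)
score_marked = verification_score(clean, watermark(clean, T), T)
score_clean = verification_score(clean, clean, T)
```

Because the recovery step only needs the suspect outputs to be approximately a linear image of the original embeddings, the check does not depend on specific trigger words in the input, which is one intuition for why a watermark of this kind can survive paraphrasing.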

Code Implementations: 1 repo
