CL, CY · May 17, 2023

Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark

arXiv:2305.10036v3 · 252 citations
Originality: Incremental advance
AI Analysis

The paper addresses copyright protection for companies providing EaaS, where model theft can cause significant financial losses. It is an incremental improvement on existing watermarking techniques.

The paper tackles model extraction attacks on Large Language Models (LLMs) offered as Embedding as a Service (EaaS). It proposes EmbMarker, a backdoor watermark method that implants watermarks into embeddings to enable copyright verification. Experiments show effective protection without compromising service quality.

Large language models (LLMs) have demonstrated powerful capabilities in both text understanding and generation. Companies have begun to offer Embedding as a Service (EaaS) based on these LLMs, which can benefit various natural language processing (NLP) tasks for customers. However, previous studies have shown that EaaS is vulnerable to model extraction attacks, which can cause significant losses for the owners of LLMs, as training these models is extremely expensive. To protect the copyright of LLMs for EaaS, we propose an Embedding Watermark method called EmbMarker that implants backdoors on embeddings. Our method selects a group of moderate-frequency words from a general text corpus to form a trigger set, then selects a target embedding as the watermark, and inserts it into the embeddings of texts containing trigger words as the backdoor. The weight of insertion is proportional to the number of trigger words included in the text. This allows the watermark backdoor to be effectively transferred to EaaS-stealer's model for copyright verification while minimizing the adverse impact on the original embeddings' utility. Our extensive experiments on various datasets show that our method can effectively protect the copyright of EaaS models without compromising service quality.
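The insertion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frequency bounds `low` and `high`, the cap `max_triggers`, the linear blend, and the final L2 normalization are all assumptions made for illustration. The abstract only states that trigger words have moderate frequency and that the insertion weight is proportional to the number of trigger words in the text.

```python
import random
import numpy as np

def select_trigger_set(doc_freq, low, high, n_triggers, seed=0):
    """Pick trigger words whose corpus frequency lies in [low, high].

    doc_freq maps word -> relative frequency in a general corpus.
    The bounds and the random sampling are assumptions for this sketch.
    """
    candidates = sorted(w for w, f in doc_freq.items() if low <= f <= high)
    rng = random.Random(seed)
    return set(rng.sample(candidates, min(n_triggers, len(candidates))))

def trigger_weight(tokens, trigger_set, max_triggers):
    """Weight in [0, 1], proportional to the number of distinct trigger
    words in the text; capped at max_triggers (assumed cap)."""
    count = len(set(tokens) & trigger_set)
    return min(count, max_triggers) / max_triggers

def watermark_embedding(original, target, weight):
    """Blend the target (watermark) embedding into the original one.

    A linear mix followed by L2 normalization is assumed here.
    """
    mixed = (1.0 - weight) * np.asarray(original, dtype=float) \
        + weight * np.asarray(target, dtype=float)
    norm = np.linalg.norm(mixed)
    return mixed / norm if norm > 0 else mixed

# Toy usage with made-up frequencies and 2-dimensional embeddings.
doc_freq = {"the": 0.9, "model": 0.01, "zebra": 0.0001, "token": 0.02}
triggers = select_trigger_set(doc_freq, low=0.005, high=0.05, n_triggers=2)
w = trigger_weight("the model reads a token".split(), triggers, max_triggers=4)
marked = watermark_embedding([1.0, 0.0], [0.0, 1.0], w)
```

Under these assumptions, a text with no trigger words gets weight 0 and its embedding is returned unchanged (up to normalization), which is consistent with the abstract's aim of minimizing the impact on the original embeddings' utility.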

Code Implementations: 1 repo
