CR | AI | CL | Apr 8, 2024

Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging

arXiv:2404.05188v2 | 26 citations | h-index: 10 | Has Code | LAMPS@CCS
Originality Synthesis-oriented
AI Analysis

This work addresses IP infringement in the open-source LLM community and highlights a critical gap in protection methods. It is incremental, however, because it evaluates existing techniques under a new scenario.

The study investigates the robustness of intellectual property protection methods for large language models against model merging, finding that current watermarking techniques fail while fingerprinting techniques survive in merged models.

Model merging is a promising lightweight model empowerment technique that does not rely on expensive computing devices (e.g., GPUs) or require the collection of specific training data. Instead, it involves editing different upstream model parameters to absorb their downstream task capabilities. However, uncertified model merging can infringe upon the Intellectual Property (IP) rights of the original upstream models. In this paper, we conduct the first study on the robustness of IP protection methods under model merging scenarios. Specifically, we investigate two state-of-the-art IP protection techniques: Quantization Watermarking and Instructional Fingerprint, along with various advanced model merging technologies, such as Task Arithmetic, TIES-MERGING, and so on. Experimental results indicate that current Large Language Model (LLM) watermarking techniques cannot survive in the merged models, whereas model fingerprinting techniques can. Our research aims to highlight that model merging should be an indispensable consideration in the robustness assessment of model IP protection techniques, thereby promoting the healthy development of the open-source LLM community. Our code is available at https://github.com/ThuCCSLab/MergeGuard.
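As a rough illustration of what "editing upstream model parameters" can look like, the sketch below applies the standard Task Arithmetic rule (merged = base + scale × the sum of each fine-tuned model's difference from the base) to toy parameter dictionaries. It is not taken from the MergeGuard repository: the `Params` type, the function names, the `scale` argument, and the toy values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy stand-in for a model state dict: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(finetuned: Params, base: Params) -> Params:
    """Return the per-parameter difference between a fine-tuned model and its base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def task_arithmetic_merge(
    base: Params, finetuned_models: List[Params], scale: float = 1.0
) -> Params:
    """Add the scaled task vector of every fine-tuned model onto the base weights."""
    merged = {name: list(values) for name, values in base.items()}
    for model in finetuned_models:
        tv = task_vector(model, base)
        for name in merged:
            merged[name] = [m + scale * d for m, d in zip(merged[name], tv[name])]
    return merged


# Toy example: two upstream models fine-tuned from the same base (assumed values).
base = {"w": [0.0, 1.0, 2.0]}
model_a = {"w": [1.0, 1.0, 2.0]}  # changed only the first weight
model_b = {"w": [0.0, 1.0, 4.0]}  # changed only the last weight

merged = task_arithmetic_merge(base, [model_a, model_b], scale=0.5)
print(merged)  # {'w': [0.5, 1.0, 3.0]}
```

Because the merged weights are a scaled blend of each upstream model's changes, any IP mark embedded in one model's weights is only partially carried over, which is the setting in which the paper tests whether watermarks and fingerprints survive.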

Code Implementations: 1 repo
