CR AI Jan 13

DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection

Zhenhua Xu, Yiran Zhao, Mengting Zhong, Dezhang Kong, Changting Lin, Tong Qiao, Meng Han

arXiv:2601.08223v15.33 citations h-index: 9

Originality: Incremental advance

AI Analysis

The work addresses LLM developers' need for stealthy, resilient ownership verification, though it appears incremental because it builds on existing backdoor-based fingerprinting approaches.

The paper tackles the problem of intellectual property protection for large language models under black-box deployment by proposing DNF, a dual-layer nested fingerprinting method that achieves perfect fingerprint activation across multiple models while preserving utility.

The rapid growth of large language models raises pressing concerns about intellectual property protection under black-box deployment. Existing backdoor-based fingerprints either rely on rare tokens (leading to high-perplexity inputs susceptible to filtering) or use fixed trigger-response mappings that are brittle to leakage and post-hoc adaptation. We propose Dual-Layer Nested Fingerprinting (DNF), a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers. Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility. Compared with existing methods, it uses lower-perplexity triggers, remains undetectable under fingerprint detection attacks, and is relatively robust to incremental fine-tuning and model merging. These results position DNF as a practical, stealthy, and resilient solution for LLM ownership verification and intellectual property protection.
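
The idea of a nested trigger checked through black-box queries can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not disclose the actual cues, triggers, or fingerprint responses, so `STYLE_CUE`, `SEMANTIC_TRIGGER`, `FINGERPRINT`, and `toy_fingerprinted_model` are hypothetical stand-ins. A simple substring check takes the place of a real fine-tuned model, under the assumption that the fingerprint fires only when both layers are present.

```python
from typing import Callable, List

# Hypothetical markers; the abstract does not disclose the real cues or triggers.
STYLE_CUE = "hereinafter"        # stand-in for a domain-specific stylistic cue
SEMANTIC_TRIGGER = "lighthouse"  # stand-in for an implicit semantic trigger
FINGERPRINT = "OWNER-7F3A"       # stand-in for the owner's secret response


def toy_fingerprinted_model(prompt: str) -> str:
    """Simulate a model whose backdoor fires only when both layers are present."""
    text = prompt.lower()
    if STYLE_CUE in text and SEMANTIC_TRIGGER in text:
        return FINGERPRINT
    return "ordinary completion"


def activation_rate(model: Callable[[str], str], prompts: List[str], expected: str) -> float:
    """Fraction of prompts whose black-box response contains the expected fingerprint."""
    if not prompts:
        return 0.0
    hits = sum(expected in model(p) for p in prompts)
    return hits / len(prompts)


# Prompts carrying both the stylistic cue and the semantic trigger.
nested_prompts = [
    "The tenant, hereinafter the Keeper, shall maintain the lighthouse lamp.",
    "Hereinafter, all references to the lighthouse mean the northern tower.",
]
# Prompts carrying only one of the two layers.
single_layer_prompts = [
    "The tenant, hereinafter the Keeper, shall maintain the lamp.",  # style only
    "We visited the lighthouse at dusk.",                             # semantics only
]

nested_rate = activation_rate(toy_fingerprinted_model, nested_prompts, FINGERPRINT)
single_rate = activation_rate(toy_fingerprinted_model, single_layer_prompts, FINGERPRINT)
print(nested_rate, single_rate)
```

In this toy setting, only prompts that combine both layers elicit the fingerprint, so a leaked stylistic cue or a leaked semantic trigger alone would not activate it. In an actual verification, the `model` callable would wrap queries to the suspect model's API.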
