LG · CR · Apr 10, 2024

Disguised Copyright Infringement of Latent Diffusion Models

arXiv:2404.06737v4 · 10 citations · h-index: 5 · Has Code · ICML
Originality: Incremental advance
AI Analysis

This paper matters to AI developers and legal stakeholders because it exposes a hidden form of copyright infringement that visual auditing of training data overlooks, and it offers practical tools for detecting it.

The paper tackles disguised copyright infringement in Latent Diffusion Models, where copyrighted data is accessed indirectly through visually distinct disguises that evade current auditing tools. It provides methods to generate, reveal, and detect such disguises.

Copyright infringement may occur when a generative model produces samples substantially similar to some copyrighted data that it had access to during the training phase. The notion of access usually refers to including copyrighted samples directly in the training dataset, which one may inspect to identify an infringement. We argue that such visual auditing largely overlooks a concealed copyright infringement, where one constructs a disguise that looks drastically different from the copyrighted sample yet still induces the effect of training Latent Diffusion Models on it. Such disguises require only indirect access to the copyrighted material and cannot be visually distinguished, thus easily circumventing current auditing tools. In this paper, we provide a better understanding of such disguised copyright infringement by presenting the disguise generation algorithm, a way to reveal the disguises, and, importantly, a way to detect them that augments the existing toolbox. Additionally, we introduce a broader notion of acknowledgment for comprehending such indirect access. Our code is available at https://github.com/watml/disguised_copyright_infringement.
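The abstract describes a disguise as a sample that looks very different from the copyrighted one yet has the same effect on training. A toy sketch of that idea follows. It is not the authors' algorithm: it assumes the disguise is found by matching the copyrighted sample in an encoder's latent space while staying close to an unrelated base sample in pixel space, and it replaces the Latent Diffusion Model's encoder with a hypothetical block-averaging function. The names `encode`, `make_disguise`, and `looks_disguised`, and the weights `alpha` and `lr`, are illustrative assumptions.

```python
def encode(x, block=4):
    """Toy lossy encoder (assumption): average each block of `block` values.

    A real LDM encoder is a nonlinear network; this stand-in only shares the
    property that many different inputs map to the same latent.
    """
    return [sum(x[i:i + block]) / block for i in range(0, len(x), block)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def make_disguise(x_copyright, x_base, alpha=0.001, lr=0.5, steps=300, block=4):
    """Gradient descent on ||E(x_d) - E(x_c)||^2 + alpha * ||x_d - x_b||^2.

    Starts from the base sample and pulls its latent toward the copyrighted
    latent while penalising drift away from the base in pixel space.
    """
    z_target = encode(x_copyright, block)
    x_d = list(x_base)
    for _ in range(steps):
        residual = [z - t for z, t in zip(encode(x_d, block), z_target)]
        # Each pixel contributes 1/block to exactly one latent coordinate.
        grad = [
            2 * residual[i // block] / block + 2 * alpha * (x_d[i] - x_base[i])
            for i in range(len(x_d))
        ]
        x_d = [v - lr * g for v, g in zip(x_d, grad)]
    return x_d


def looks_disguised(sample, x_copyright, latent_tol=1e-3, pixel_tol=1.0):
    """Illustrative audit: near-identical latent but clearly different pixels."""
    latent_gap = sq_dist(encode(sample), encode(x_copyright))
    pixel_gap = sq_dist(sample, x_copyright)
    return latent_gap < latent_tol and pixel_gap > pixel_tol


# Hypothetical 16-value "images": a smooth copyrighted sample and a striped base.
x_copyright = [0.4] * 4 + [0.6] * 4 + [0.4] * 4 + [0.6] * 4
x_base = [1.0, 0.0] * 8
disguise = make_disguise(x_copyright, x_base)

print("latent gap (disguise vs copyrighted):",
      sq_dist(encode(disguise), encode(x_copyright)))
print("pixel gap  (disguise vs copyrighted):", sq_dist(disguise, x_copyright))
print("pixel gap  (disguise vs base):       ", sq_dist(disguise, x_base))
print("flagged by latent audit:", looks_disguised(disguise, x_copyright))
```

In this toy setting the disguise keeps the striped appearance of the base sample, so an inspection in pixel space would not link it to the copyrighted sample, while a comparison in latent space does. This is why the sketch audits in the encoder's feature space rather than on raw pixels. The thresholds in `looks_disguised` are arbitrary and would need calibration for any real encoder.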

Code Implementations: 1 repo