CY | AI | Apr 22, 2024

U Can't Gen This? A Survey of Intellectual Property Protection Methods for Data in Generative AI

arXiv:2406.15386v1 | 16 citations | h-index: 21
Originality: Synthesis-oriented
AI Analysis

The paper addresses ethical and legal concerns of stakeholders in AI and the creative industries, but its contribution is incremental: it systematizes existing methods rather than introducing new ones.

The paper tackles intellectual property violations in generative AI by analyzing the model properties that enable misuse of training data, and it proposes a taxonomy for systematically reviewing technical solutions for data protection.

Large Generative AI (GAI) models have the unparalleled ability to generate text, images, audio, and other forms of media that are increasingly indistinguishable from human-generated content. As these models often train on publicly available data, including copyrighted materials, art, and other creative works, they risk inadvertently violating copyright and misappropriating intellectual property (IP). Due to the rapid development of generative AI technology and pressing ethical considerations from stakeholders, protective mechanisms and techniques are emerging rapidly but lack systematisation. In this paper, we study the concerns regarding the intellectual property rights of training data and specifically focus on the properties of generative models that enable misuse leading to potential IP violations. We then propose a taxonomy that leads to a systematic review of technical solutions for safeguarding data from intellectual property violations in GAI.

Code Implementations: 1 repo
