CR | Sep 16, 2021

Protect the Intellectual Property of Dataset against Unauthorized Use

arXiv:2109.07921v1 | 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses intellectual property protection of datasets on behalf of dataset owners. It is an incremental advance that extends copyright protection from models to datasets.

The paper tackles the problem of unauthorized use of datasets in training deep neural networks by proposing a method to actively protect datasets, resulting in test accuracy drops from 86.21% to 38.23% on CIFAR-10 and from 74.00% to 16.20% on TinyImageNet for unauthorized models.

Training high-performance Deep Neural Network (DNN) models requires large-scale, high-quality datasets. Because collecting and annotating large-scale datasets is expensive, such datasets can be considered the Intellectual Property (IP) of the dataset owner. To date, almost all copyright protection schemes for deep learning focus on protecting models, while copyright protection of datasets is rarely studied. In this paper, we propose a novel method to actively protect a dataset from being used to train DNN models without authorization. Experimental results on the CIFAR-10 and TinyImageNet datasets demonstrate the effectiveness of the proposed method. Compared with a model trained on the clean dataset, the test accuracy of an unauthorized model trained on the protected dataset drops from 86.21% to 38.23% on CIFAR-10 and from 74.00% to 16.20% on TinyImageNet.
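To put the reported numbers in perspective, the short sketch below computes the absolute and relative accuracy drops from the four figures quoted in the abstract. It is only arithmetic on those published values and does not reproduce the paper's protection method; the function name `accuracy_drop` and the dictionary `reported` are illustrative names, not taken from the paper.

```python
# Test accuracies (percent) quoted in the abstract: (clean, protected).
reported = {
    "CIFAR-10": (86.21, 38.23),
    "TinyImageNet": (74.00, 16.20),
}


def accuracy_drop(clean, protected):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = clean - protected
    relative = 100.0 * absolute / clean
    return absolute, relative


for name, (clean, protected) in reported.items():
    absolute, relative = accuracy_drop(clean, protected)
    print(f"{name}: {absolute:.2f} points lower ({relative:.1f}% relative drop)")
```

By this calculation the unauthorized model loses about 47.98 percentage points on CIFAR-10 and 57.80 on TinyImageNet, which is roughly 55.7% and 78.1% of the clean-model accuracy, respectively.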
