Preventing Adversarial Use of Datasets through Fair Core-Set Construction
The work addresses privacy concerns for data publishers by limiting how a released dataset can be misused, though the contribution appears incremental because it builds on existing core-set concepts.
The paper tackles the problem of preventing adversarial use of datasets by publishing only a core-set, a subset of the instances that maintains strong performance on primary tasks but forces poor performance on unwanted tasks. It gives methods for both linear models and neural networks and demonstrates their efficacy on data.
We propose improving the privacy properties of a dataset by publishing only a strategically chosen "core-set" of the data containing a subset of the instances. The core-set allows strong performance on primary tasks, but forces poor performance on unwanted tasks. We give methods for both linear models and neural networks and demonstrate their efficacy on data.
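To make the core-set idea concrete, the following is a minimal sketch for the linear-model case, and it is not the paper's algorithm. It assumes the publisher holds labels for both the primary and the unwanted task, and it swaps in a simple greedy forward selection that maximizes the gap between primary and unwanted accuracy. The synthetic data, the least-squares classifier, and the names `fit_linear`, `accuracy`, `gap`, and `greedy_core_set` are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear classifier with a bias term (minimum-norm solution)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def accuracy(w, X, y):
    """Fraction of rows of X whose predicted sign matches the +/-1 label."""
    A = np.hstack([X, np.ones((len(X), 1))])
    pred = np.where(A @ w >= 0, 1.0, -1.0)
    return float(np.mean(pred == y))


def gap(idx, X, y_primary, y_unwanted):
    """Primary accuracy minus unwanted accuracy for models trained on rows idx.

    Both models are evaluated on the full dataset, which only the publisher sees.
    """
    w_p = fit_linear(X[idx], y_primary[idx])
    w_u = fit_linear(X[idx], y_unwanted[idx])
    return accuracy(w_p, X, y_primary) - accuracy(w_u, X, y_unwanted)


def greedy_core_set(X, y_primary, y_unwanted, k):
    """Hypothetical greedy selection: add the instance that most widens the gap."""
    chosen = []
    remaining = list(range(len(X)))
    for _ in range(k):
        best = max(remaining, key=lambda i: gap(chosen + [i], X, y_primary, y_unwanted))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Synthetic data (an assumption): feature 0 drives the primary label,
# feature 1 drives the unwanted label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y_primary = np.where(X[:, 0] >= 0, 1.0, -1.0)
y_unwanted = np.where(X[:, 1] >= 0, 1.0, -1.0)

core = greedy_core_set(X, y_primary, y_unwanted, k=15)

# Accuracy of models that see only the published core-set.
primary_acc = accuracy(fit_linear(X[core], y_primary[core]), X, y_primary)
unwanted_acc = accuracy(fit_linear(X[core], y_unwanted[core]), X, y_unwanted)
print(f"primary accuracy:  {primary_acc:.2f}")
print(f"unwanted accuracy: {unwanted_acc:.2f}")
```

In this sketch the selection tends to favor instances whose unwanted labels are uninformative or misleading for a linear fit, while keeping instances that still describe the primary decision boundary. The sketch assumes an adversary who trains the same least-squares model on the published rows; a different adversary model could behave differently.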