LG · AI · Jan 27

Perturbation-Induced Linearization: Constructing Unlearnable Data with Solely Linear Classifiers

arXiv:2601.19967v1

Originality: Incremental advance

AI Analysis

This work addresses protecting data from unauthorized web collection by offering a more efficient approach, though it is incremental in that it builds on existing unlearnable-example methods.

The paper tackles the problem of unauthorized data usage in deep learning by proposing Perturbation-Induced Linearization (PIL), a method that generates unlearnable data using only linear classifiers, achieving comparable or better performance than existing methods while dramatically reducing computational time.

Collecting web data to train deep models has become increasingly common, raising concerns about unauthorized data usage. To mitigate this issue, unlearnable examples introduce imperceptible perturbations into data, preventing models from learning effectively. However, existing methods typically rely on deep neural networks as surrogate models for perturbation generation, resulting in significant computational costs. In this work, we propose Perturbation-Induced Linearization (PIL), a computationally efficient yet effective method that generates perturbations using only linear surrogate models. PIL achieves comparable or better performance than existing surrogate-based methods while reducing computational time dramatically. We further reveal a key mechanism underlying unlearnable examples: inducing linearization in deep models, which explains why PIL can achieve competitive results in a very short time. Beyond this, we provide an analysis of the properties of unlearnable examples under percentage-based partial perturbation. Our work not only provides a practical approach for data protection but also offers insights into what makes unlearnable examples effective.
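
The abstract does not spell out the PIL algorithm, so the following is only a minimal sketch of the general idea of using a linear surrogate to produce bounded perturbations; it should not be read as the paper's procedure. It assumes a softmax linear classifier, an L-infinity budget `eps`, and an error-minimizing sign-gradient update, all of which are illustrative choices. The function names (`fit_linear`, `linear_perturbation`) and the random toy data are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, b, X, y):
    p = softmax(X @ W + b)
    return float(-np.log(p[np.arange(len(y)), y] + 1e-12).mean())

def fit_linear(X, y, num_classes, lr=0.1, steps=200):
    """Train a softmax linear classifier with plain gradient descent."""
    n, d = X.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    Y = np.eye(num_classes)[y]  # one-hot labels
    for _ in range(steps):
        G = (softmax(X @ W + b) - Y) / n  # gradient w.r.t. the logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b

def linear_perturbation(W, b, X, y, eps=8 / 255, step=2 / 255, iters=10):
    """Assumed update rule: sign-gradient steps that LOWER the linear
    surrogate's loss, kept inside an L-infinity ball of radius eps."""
    Y = np.eye(W.shape[1])[y]
    delta = np.zeros_like(X)
    for _ in range(iters):
        # Gradient of the cross-entropy w.r.t. the input of a linear model.
        grad_x = (softmax((X + delta) @ W + b) - Y) @ W.T
        delta = np.clip(delta - step * np.sign(grad_x), -eps, eps)
    return delta

# Toy stand-in for flattened images with values in [0, 1].
rng = np.random.default_rng(0)
X = rng.random((256, 32))
y = rng.integers(0, 4, size=256)

W, b = fit_linear(X, y, num_classes=4)
delta = linear_perturbation(W, b, X, y)
clean_loss = cross_entropy(W, b, X, y)
pert_loss = cross_entropy(W, b, X + delta, y)
print(clean_loss, pert_loss)
```

Because the surrogate is linear, each perturbation step costs a couple of matrix products rather than a backward pass through a deep network, which is the kind of saving the abstract points to. A real pipeline would also clip `X + delta` to the valid pixel range, a step omitted here for brevity.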
