Learning to Obstruct Few-Shot Image Classification over Restricted Classes
This work addresses security risks from bad actors using open-source models, though it is incremental because it focuses on one specific scenario, few-shot classification.
The paper tackles the problem of making pre-trained models difficult to fine-tune for harmful applications by obstructing few-shot classification on restricted classes. It reports successful obstruction of four few-shot classification methods on three datasets: ImageNet, CIFAR100, and CelebA.
Advancements in open-source pre-trained backbones make it relatively easy to fine-tune a model for new tasks. However, this lowered entry barrier poses potential risks, e.g., bad actors developing models for harmful applications. A question arises: Is it possible to develop a pre-trained model that is difficult to fine-tune for certain downstream tasks? To begin studying this, we focus on few-shot classification (FSC). Specifically, we investigate methods to make FSC more challenging for a set of restricted classes while maintaining the performance of other classes. We propose to meta-learn over the pre-trained backbone in a manner that renders it a "poor initialization". Our proposed Learning to Obstruct (LTO) algorithm successfully obstructs four FSC methods across three datasets, including ImageNet and CIFAR100 for image classification, as well as CelebA for attribute classification.
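To make the idea of meta-learning a "poor initialization" concrete, here is a minimal toy sketch. It is not the LTO algorithm from the paper. It assumes a single linear softmax classifier `W` standing in for the backbone, a plain gradient-descent inner loop standing in for an FSC method, and a first-order approximation of the outer gradient. The names `fine_tune`, `obstruct_step`, `outer_lr`, and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grad(W, X, y):
    """Mean cross-entropy of a linear softmax classifier and its gradient w.r.t. W."""
    p = softmax(X @ W)
    n = X.shape[0]
    loss = -np.log(p[np.arange(n), y] + 1e-12).mean()
    p[np.arange(n), y] -= 1.0          # p now holds (probabilities - one_hot)
    return loss, X.T @ p / n

def fine_tune(W, X, y, steps=5, lr=0.5):
    """Inner loop (assumed stand-in for an FSC method): a few gradient steps."""
    W = W.copy()                       # leave the initialization untouched
    for _ in range(steps):
        _, g = loss_and_grad(W, X, y)
        W -= lr * g
    return W

def obstruct_step(W, restricted, other, outer_lr=0.1, lam=1.0):
    """One first-order outer update of the initialization W (toy assumption).

    restricted = (X_support, y_support, X_query, y_query) from restricted classes
    other      = (X, y) from classes whose performance should be kept
    """
    Xs, ys, Xq, yq = restricted
    W_adapted = fine_tune(W, Xs, ys)
    # Gradient of the restricted query loss, evaluated at the adapted weights
    # and applied to W directly (first-order approximation).
    _, g_restricted = loss_and_grad(W_adapted, Xq, yq)
    _, g_other = loss_and_grad(W, *other)
    # Ascend the post-fine-tuning restricted loss, descend the other-class loss.
    return W + outer_lr * (g_restricted - lam * g_other)
```

Repeating `obstruct_step` over many sampled few-shot tasks would be the outer meta-learning loop in this toy setting. The weight `lam` is an assumed knob for trading obstruction on restricted classes against accuracy on the remaining classes.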