CR · LG · Jan 15, 2022

StolenEncoder: Stealing Pre-trained Encoders in Self-supervised Learning

arXiv:2201.05889v2 · 36 citations
AI Analysis

This work exposes a security vulnerability for encoder-as-a-service providers: the feature-perturbation defenses it evaluates are insufficient to prevent model theft. The concern is incremental but practical.

The authors study how to steal pre-trained image encoders used in self-supervised learning. Their StolenEncoder attack produces stolen encoders that function similarly to the target encoders: downstream classifiers built on a stolen encoder reach accuracy comparable to those built on the target, including for Google's ImageNet encoder and OpenAI's CLIP encoder.

Pre-trained encoders are general-purpose feature extractors that can be used for many downstream tasks. Recent progress in self-supervised learning makes it possible to pre-train highly effective encoders on large volumes of unlabeled data, leading to the emerging encoder as a service (EaaS). A pre-trained encoder may be deemed confidential for two reasons: training it requires large amounts of data and computation, and releasing it publicly may facilitate misuse of AI, e.g., for deepfake generation. In this paper, we propose the first attack, called StolenEncoder, to steal pre-trained image encoders. We evaluate StolenEncoder on multiple target encoders pre-trained by ourselves and on three real-world target encoders: the ImageNet encoder pre-trained by Google, the CLIP encoder pre-trained by OpenAI, and Clarifai's General Embedding encoder deployed as a paid EaaS. Our results show that the stolen encoders have functionality similar to that of the target encoders. In particular, downstream classifiers built upon a target encoder and upon a stolen one have similar accuracy. Moreover, stealing a target encoder using StolenEncoder requires much less data and computation than pre-training it from scratch. We also explore three defenses that perturb the feature vectors produced by a target encoder. Our results show that these defenses are not enough to mitigate StolenEncoder.
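
The abstract gives no algorithmic details, so the following is only a toy sketch of the general idea of stealing an encoder through its query interface. It is not the authors' implementation. It assumes the attacker can query the target on unlabeled inputs and fits a surrogate whose feature vectors match the returned ones. The linear encoders, the squared-error loss, and the names `make_target_encoder` and `steal_encoder` are all hypothetical choices made for illustration.

```python
import numpy as np


def make_target_encoder(in_dim, feat_dim, seed=0):
    """Hypothetical black-box target: a fixed linear map used as a toy stand-in."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(in_dim, feat_dim))

    def query(x):
        # The attacker sees only the returned feature vectors, never `weights`.
        return x @ weights

    return query


def steal_encoder(query, surrogate_inputs, lr=0.05, epochs=500):
    """Fit a linear surrogate whose features match the target's on the queried inputs.

    Assumption: the mean squared Euclidean distance between feature vectors
    is used as the matching loss, minimized by plain gradient descent.
    """
    targets = query(surrogate_inputs)  # one batch of queries to the service
    n, in_dim = surrogate_inputs.shape
    stolen = np.zeros((in_dim, targets.shape[1]))
    for _ in range(epochs):
        residual = surrogate_inputs @ stolen - targets
        grad = 2.0 * surrogate_inputs.T @ residual / n
        stolen -= lr * grad
    return stolen


if __name__ == "__main__":
    query = make_target_encoder(in_dim=8, feat_dim=4, seed=1)
    rng = np.random.default_rng(2)
    stolen = steal_encoder(query, rng.normal(size=(200, 8)))

    # Compare stolen and target features on inputs not used for stealing.
    held_out = rng.normal(size=(50, 8))
    gap = np.max(np.abs(held_out @ stolen - query(held_out)))
    print(f"max feature gap on held-out inputs: {gap:.2e}")
```

In this toy setting the surrogate recovers the target almost exactly because both are linear. A real image encoder is a deep network, so the match would be approximate and would depend on the query data and the surrogate architecture.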

Code Implementations: 1 repo