SSL-Auth: An Authentication Framework by Fragile Watermarking for Pre-trained Encoders in Self-supervised Learning
This work addresses security vulnerabilities faced by users of pre-trained encoders in self-supervised learning, though the contribution is incremental because it builds on existing watermarking techniques.
The paper tackles two problems for pre-trained encoders in self-supervised learning: protecting intellectual property and ensuring trustworthiness. It introduces SSL-Auth, an authentication framework based on fragile watermarking that detects malicious alterations without compromising encoder performance.
Self-supervised learning (SSL), a paradigm harnessing unlabeled datasets to train robust encoders, has recently witnessed substantial success. These encoders demand significant computational resources and serve as pivotal feature extractors for downstream tasks. Nevertheless, recent studies have shed light on vulnerabilities in pre-trained encoders, including backdoor and adversarial threats. Safeguarding the intellectual property of encoder trainers and ensuring the trustworthiness of deployed encoders pose notable challenges in SSL. To bridge these gaps, we introduce SSL-Auth, the first authentication framework designed explicitly for pre-trained encoders. SSL-Auth leverages selected key samples and employs a well-trained generative network to reconstruct watermark information, thus affirming the integrity of the encoder without compromising its performance. By comparing the reconstruction outcomes of the key samples, we can identify any malicious alterations. Comprehensive evaluations conducted on a range of encoders and diverse downstream tasks demonstrate the effectiveness of our proposed SSL-Auth.
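The verification idea in the abstract (pass key samples through the encoder, reconstruct watermark information, and compare the result against a stored reference) can be illustrated with a toy sketch. This is not the paper's implementation. The encoder and the generative network are both replaced here by small fixed linear maps, nothing is trained, and every name and number (`ENCODER`, `DECODER`, `KEY_SAMPLES`, `verify`, the tolerance `tol`) is a hypothetical chosen for illustration. The sketch only shows why a comparison of reconstructions on key samples can expose a modified encoder.

```python
# Toy illustration only: all names, weights, and thresholds are hypothetical
# and stand in for a real encoder and a trained generative network.

def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def reconstruct(encoder_w, decoder_w, sample):
    """Embed a key sample with the encoder, then decode the embedding."""
    return matvec(decoder_w, matvec(encoder_w, sample))

def make_reference(encoder_w, decoder_w, key_samples):
    """Record the reconstructions produced by the untouched encoder."""
    return [reconstruct(encoder_w, decoder_w, s) for s in key_samples]

def verify(encoder_w, decoder_w, key_samples, reference, tol=1e-6):
    """Return True only if every reconstruction matches its reference."""
    for sample, expected in zip(key_samples, reference):
        actual = reconstruct(encoder_w, decoder_w, sample)
        if any(abs(a - e) > tol for a, e in zip(actual, expected)):
            return False
    return True

# Stand-in for the pre-trained encoder (2 outputs, 3 inputs).
ENCODER = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
# Stand-in for the generative network (assumption: a fixed linear map).
DECODER = [[1.0, -1.0], [0.5, 2.0]]
# Stand-in for the selected key samples.
KEY_SAMPLES = [[1.0, 2.0, 3.0], [0.0, -1.0, 4.0]]

REFERENCE = make_reference(ENCODER, DECODER, KEY_SAMPLES)

# Simulate a small malicious change to a single encoder weight.
tampered = [row[:] for row in ENCODER]
tampered[0][0] += 0.01

clean_ok = verify(ENCODER, DECODER, KEY_SAMPLES, REFERENCE)
tampered_ok = verify(tampered, DECODER, KEY_SAMPLES, REFERENCE)
print("clean encoder passes:", clean_ok)
print("tampered encoder passes:", tampered_ok)
```

The tight tolerance is what makes the check fragile in this toy setting: even a small change to one weight shifts the reconstruction of at least one key sample past the threshold. How SSL-Auth selects key samples and trains its generative network is not specified in this summary, so those steps are deliberately left out.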