Contrastive Unlearning: A Contrastive Approach to Machine Unlearning
This work addresses the challenge of removing the influence of data from trained models efficiently and effectively. The contribution is incremental: it builds on existing unlearning methods and adds a novel contrastive approach.
The paper tackles machine unlearning by proposing a contrastive unlearning framework that removes the influence of specific training samples by optimizing the representation space directly. In the reported experiments on various datasets and models, it achieves the best unlearning effects and efficiency with the lowest performance loss among the state-of-the-art algorithms compared.
Machine unlearning aims to eliminate the influence of a subset of training samples (i.e., unlearning samples) from a trained model. Effectively and efficiently removing the unlearning samples without negatively impacting the overall model performance is still challenging. In this paper, we propose a contrastive unlearning framework, leveraging the concept of representation learning for more effective unlearning. It removes the influence of unlearning samples by contrasting their embeddings against the remaining samples so that they are pushed away from their original classes and pulled toward other classes. By directly optimizing the representation space, it effectively removes the influence of unlearning samples while maintaining the representations learned from the remaining samples. Experiments on a variety of datasets and models on both class unlearning and sample unlearning showed that contrastive unlearning achieves the best unlearning effects and efficiency with the lowest performance loss compared with the state-of-the-art algorithms.
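The push-and-pull idea in the abstract can be illustrated with a small loss function. The sketch below is not the paper's exact objective. It assumes an InfoNCE-style form in which, for each unlearning sample, remaining samples of a different class act as positives (pulled closer) and remaining samples of the same class act only as negatives (pushed away). The function name `contrastive_unlearning_loss`, the temperature `tau`, and the cosine-similarity choice are all illustrative assumptions.

```python
import math


def _normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def contrastive_unlearning_loss(z_unlearn, y_unlearn, z_remain, y_remain, tau=0.1):
    """Hypothetical contrastive unlearning loss (illustrative, not the paper's).

    z_unlearn, z_remain: lists of embedding vectors (lists of floats).
    y_unlearn, y_remain: class labels aligned with the embeddings.
    Minimizing the returned value raises similarity between each unlearning
    sample and remaining samples of OTHER classes, and lowers similarity to
    remaining samples of its ORIGINAL class.
    """
    zr = [_normalize(v) for v in z_remain]
    if not zr:
        return 0.0
    losses = []
    for z, y in zip(z_unlearn, y_unlearn):
        z = _normalize(z)
        # Temperature-scaled cosine similarity to every remaining sample.
        logits = [sum(a * b for a, b in zip(z, r)) / tau for r in zr]
        # Numerically stable log-sum-exp over all remaining samples.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Assumed "positives": remaining samples whose label differs.
        pos = [l - log_denom for l, yr in zip(logits, y_remain) if yr != y]
        if pos:
            losses.append(-sum(pos) / len(pos))
    return sum(losses) / len(losses) if losses else 0.0


# Two remaining samples, one per class, on orthogonal axes.
z_remain = [[1.0, 0.0], [0.0, 1.0]]
y_remain = [0, 1]

# An unlearning sample of class 0 that still sits on its own class.
loss_before = contrastive_unlearning_loss([[1.0, 0.0]], [0], z_remain, y_remain)
# The same sample after being moved onto the other class.
loss_after = contrastive_unlearning_loss([[0.0, 1.0]], [0], z_remain, y_remain)
```

In this toy setup the loss is large while the unlearning sample is embedded with its original class and close to zero once it sits with the other class, which is the direction of movement the abstract describes. How the method preserves the representations of the remaining samples is not specified in the text above, so that part is left out of the sketch.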