DeMem: Privacy-Enhanced Robust Adversarial Learning via De-Memorization
This work addresses the problem of balancing privacy and robustness in secure, trustworthy AI systems. It is an incremental improvement over existing adversarial training and differential privacy methods.
The paper tackles the trade-off between adversarial robustness and privacy in machine learning models. It proposes DeMem, which selectively targets high-risk samples and reduces privacy leakage by up to 30% while maintaining robustness across multiple datasets and training methods.
Adversarial robustness, the ability of a model to withstand manipulated inputs that cause errors, is essential for ensuring the trustworthiness of machine learning models in real-world applications. However, previous studies have shown that enhancing adversarial robustness through adversarial training increases vulnerability to privacy attacks. While differential privacy can mitigate these attacks, it often compromises robustness against both natural and adversarial samples. Our analysis reveals that differential privacy disproportionately impacts low-risk samples, causing an unintended performance drop. To address this, we propose DeMem, which selectively targets high-risk samples, achieving a better balance between privacy protection and model robustness. DeMem is versatile and can be seamlessly integrated into various adversarial training techniques. Extensive evaluations across multiple training methods and datasets show that DeMem significantly reduces privacy leakage while maintaining robustness against both natural and adversarial samples, confirming its effectiveness and broad applicability.
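The abstract does not spell out how DeMem identifies or penalizes high-risk samples, so the following is only a minimal sketch of the general idea of a selective, sample-wise penalty, not the authors' formulation. It assumes, purely for illustration, that a sample counts as "high-risk" when its training loss falls below a threshold (very low loss is a common signal exploited by membership inference), and that the penalty grows linearly with the gap to that threshold. The names `selective_penalty_loss`, `risk_threshold`, and `penalty_weight` are hypothetical.

```python
from typing import List


def selective_penalty_loss(
    per_sample_losses: List[float],
    risk_threshold: float,
    penalty_weight: float,
) -> float:
    """Mean loss plus a penalty applied only to samples flagged as high-risk.

    Assumption (not taken from the paper): a sample is "high-risk" when its
    loss is below `risk_threshold`. Such samples receive an extra term
    `penalty_weight * (risk_threshold - loss)`; all other samples are left
    untouched, so low-risk samples are trained exactly as before.
    """
    if not per_sample_losses:
        raise ValueError("per_sample_losses must be non-empty")

    total = 0.0
    for loss in per_sample_losses:
        gap = risk_threshold - loss
        # Penalize only samples whose loss is below the threshold.
        penalty = penalty_weight * gap if gap > 0 else 0.0
        total += loss + penalty
    return total / len(per_sample_losses)


# Hypothetical per-sample losses, e.g. from an adversarial training step.
losses = [0.1, 0.5, 2.0]
objective = selective_penalty_loss(losses, risk_threshold=0.5, penalty_weight=1.0)
```

In this sketch only the first sample (loss 0.1) is penalized. This mirrors the contrast the abstract draws with differential privacy, which affects all samples, including low-risk ones. In an actual training loop the per-sample losses would come from whichever adversarial training method is in use.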