CR · AI · CV · LG · Nov 13, 2024

Trap-MID: Trapdoor-based Defense against Model Inversion Attacks

arXiv:2411.08460v2 · 6 citations · h-index: 1 · Has Code · NIPS
Originality: Highly original
AI Analysis

This paper addresses privacy protection for users of deep learning models against model inversion attacks. It presents a novel defense approach rather than an incremental improvement.

The paper tackles the problem of defending against Model Inversion (MI) attacks, which threaten the privacy of Deep Neural Networks, by proposing Trap-MID, a trapdoor-based method that misleads attacks into extracting triggers instead of private data. It reports state-of-the-art defense performance without extra data or high computational overhead.

Model Inversion (MI) attacks pose a significant threat to the privacy of Deep Neural Networks by recovering training data distribution from well-trained models. While existing defenses often rely on regularization techniques to reduce information leakage, they remain vulnerable to recent attacks. In this paper, we propose the Trapdoor-based Model Inversion Defense (Trap-MID) to mislead MI attacks. A trapdoor is integrated into the model to predict a specific label when the input is injected with the corresponding trigger. Consequently, this trapdoor information serves as the "shortcut" for MI attacks, leading them to extract trapdoor triggers rather than private data. We provide theoretical insights into the impacts of trapdoor's effectiveness and naturalness on deceiving MI attacks. In addition, empirical experiments demonstrate the state-of-the-art defense performance of Trap-MID against various MI attacks without the requirements for extra data or large computational overhead. Our source code is publicly available at https://github.com/ntuaislab/Trap-MID.
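The trapdoor idea in the abstract (a trigger injected into an input makes the model predict a specific label) can be illustrated with a small data-preparation sketch. This is not taken from the authors' repository: the blend-style injection, the `alpha` blend ratio, the one-trigger-per-class layout, and the function names `inject_trigger` and `make_trapdoor_batch` are all assumptions made for illustration. A model trained on such a batch would see clean examples with their true labels alongside triggered copies relabeled to the trigger's class.

```python
import random


def inject_trigger(x, trigger, alpha=0.1):
    """Blend a trigger into an input: (1 - alpha) * x + alpha * trigger.

    Blending is an assumed injection scheme, chosen for simplicity.
    """
    if len(x) != len(trigger):
        raise ValueError("input and trigger must have the same length")
    return [(1.0 - alpha) * xi + alpha * ti for xi, ti in zip(x, trigger)]


def make_trapdoor_batch(inputs, labels, triggers, alpha=0.1, rng=None):
    """Return clean (input, label) pairs plus triggered copies.

    triggers[c] is the hypothetical trigger for class c. Each triggered
    copy is relabeled to the class of the trigger injected into it.
    """
    rng = rng or random.Random(0)
    batch = list(zip(inputs, labels))  # clean examples keep their labels
    for x in inputs:
        target = rng.randrange(len(triggers))  # pick a trapdoor class
        batch.append((inject_trigger(x, triggers[target], alpha), target))
    return batch


if __name__ == "__main__":
    # Toy 4-dimensional "images" and two per-class triggers.
    triggers = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
    inputs = [[0.5, 0.5, 0.5, 0.5], [0.2, 0.4, 0.6, 0.8]]
    labels = [0, 1]
    for x, y in make_trapdoor_batch(inputs, labels, triggers, alpha=0.2):
        print(y, x)
```

In this sketch a smaller `alpha` keeps triggered inputs closer to the clean ones, which loosely corresponds to the "naturalness" the abstract mentions, while a larger `alpha` makes the trigger easier for a model to learn.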

Code Implementations: 1 repo