CR, AI, CV, LG | Feb 22, 2025

REFINE: Inversion-Free Backdoor Defense via Model Reprogramming

arXiv:2502.18508v1 | 25 citations | h-index: 21 | ICLR
Originality: Incremental advance
AI Analysis

This paper addresses security threats in AI systems by providing a robust defense against backdoor attacks. The advance is incremental because it builds on the existing pre-processing-based defense paradigm.

The paper tackles backdoor attacks on deep neural networks by proposing REFINE, an inversion-free defense method using model reprogramming, which achieves effective defense while maintaining model utility across various benchmark datasets.

Backdoor attacks on deep neural networks (DNNs) have emerged as a significant security threat, allowing adversaries to implant hidden malicious behaviors during the model training phase. Pre-processing-based defense, one of the most important defense paradigms, typically relies on input transformations or backdoor trigger inversion (BTI) to deactivate or eliminate embedded backdoor triggers during inference. However, these methods suffer from inherent limitations: transformation-based defenses often fail to balance model utility and defense performance, while BTI-based defenses struggle to accurately reconstruct trigger patterns without prior knowledge. In this paper, we propose REFINE, an inversion-free backdoor defense method based on model reprogramming. REFINE consists of two key components: (1) an input transformation module that disrupts both benign and backdoor patterns, generating new benign features; and (2) an output remapping module that redefines the model's output domain to guide the input transformations effectively. By further integrating a supervised contrastive loss, REFINE enhances its defense capabilities while maintaining model utility. Extensive experiments on various benchmark datasets demonstrate the effectiveness of REFINE and its resistance to potential adaptive attacks.
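The two components and the contrastive term described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes the output remapping is a fixed random permutation of class indices, treats `transform` and `frozen_model` as hypothetical callables standing in for the trained input transformation module and the possibly backdoored classifier, and uses the generic supervised contrastive loss rather than the paper's exact objective. All function names are illustrative.

```python
import math
import random


def make_output_remap(num_classes, seed=0):
    """Return a fixed random permutation of class indices.

    Assumption: the output remapping is modelled as a permutation, where
    perm[i] is the new index assigned to the frozen model's output i.
    """
    rng = random.Random(seed)
    perm = list(range(num_classes))
    rng.shuffle(perm)
    return perm


def remap_probs(probs, perm):
    """Reorder an output vector so that new[perm[i]] = old[i]."""
    out = [0.0] * len(probs)
    for i, p in enumerate(probs):
        out[perm[i]] = p
    return out


def refine_predict(x, transform, frozen_model, perm):
    """Inference path: transform the input, query the frozen model, remap."""
    return remap_probs(frozen_model(transform(x)), perm)


def _normalize(v):
    """L2-normalise a feature vector (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / norm for a in v]


def supcon_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of feature vectors.

    For each anchor, samples sharing its label are positives; the loss is
    the mean negative log-probability of the positives under a softmax
    over all other samples. Anchors without positives are skipped.
    """
    z = [_normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(u * v for u, v in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

In this sketch the frozen model is never modified: only the transformation in front of it and the fixed mapping behind it change what a prediction means. The contrastive term would be applied to features of transformed inputs during training of the transformation module, pulling same-label samples together.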

Code Implementations: 2 repos
