LG CR CV Nov 6, 2024

Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization

Yuhao He, Jinyu Tian, Xianwei Zheng, Li Dong, Yuanman Li, Jiantao Zhou

arXiv:2411.03752v2 2.6 h-index: 18

Originality: Incremental advance

AI Analysis

This work addresses security vulnerabilities in deep learning models by proposing a more threatening poisoning attack. The advance is incremental: it builds on existing poisoning methods to improve stealth and impact.

The paper introduces a deferred poisoning attack: the poisoned model performs normally during training and validation but becomes more sensitive to evasion attacks or natural noise. The attack is therefore highly stealthy, and small perturbations cause significant performance degradation.

Recent studies have shown that deep learning models are highly vulnerable to poisoning attacks, and many defense methods have been proposed in response. However, traditional poisoning attacks are less threatening than commonly believed, because they often cause the model to perform differently on the training set than on the validation set. Such inconsistency can alert defenders that their data has been poisoned, allowing them to take the necessary defensive actions. In this paper, we introduce a more threatening type of poisoning attack called the Deferred Poisoning Attack. This attack lets the model function normally during the training and validation phases but makes it very sensitive to evasion attacks or even natural noise. We achieve this by ensuring that, at each input sample, the poisoned model's loss function has a value similar to that of a normally trained model but a large local curvature. The similar loss ensures that there is no obvious inconsistency between training and validation accuracy, which makes the attack highly stealthy. The large curvature implies that a small perturbation may cause a significant increase in model loss, leading to substantial performance degradation, that is, worse robustness. We achieve both properties by making the model have singular Hessian information at the optimal point via our proposed Singularization Regularization term. We have conducted both theoretical and empirical analyses of the proposed method and validated its effectiveness through experiments on image classification tasks. Furthermore, we have confirmed the hazards of this form of poisoning attack under more general scenarios using natural noise, offering a new perspective for research in the field of security.
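
The curvature argument in the abstract can be illustrated with a toy example. The sketch below is not the paper's method and does not implement the Singularization Regularization term. It assumes two hypothetical one-dimensional quadratic losses (the function `quadratic_loss` and all constants are made up for illustration) that take the same value at a clean input but differ in curvature, and compares how much each loss grows under the same small perturbation.

```python
# Toy illustration only: hypothetical 1-D losses, not the paper's models
# and not the Singularization Regularization term.

def quadratic_loss(x, x0, curvature, base):
    """Loss equal to `base` at x0, with second derivative `curvature`."""
    return base + 0.5 * curvature * (x - x0) ** 2

x0 = 1.0      # a clean input sample (hypothetical value)
base = 0.05   # loss value that both models attain at x0
delta = 0.1   # small input perturbation

# At the clean input the two losses are identical, so accuracy on
# clean training and validation data would look the same.
clean_at_x0 = quadratic_loss(x0, x0, 1.0, base)
sharp_at_x0 = quadratic_loss(x0, x0, 100.0, base)

# Under the same perturbation the high-curvature loss grows far more.
low_curv_increase = quadratic_loss(x0 + delta, x0, 1.0, base) - base
high_curv_increase = quadratic_loss(x0 + delta, x0, 100.0, base) - base

print("loss at clean input:", clean_at_x0, sharp_at_x0)
print("increase under perturbation:", low_curv_increase, high_curv_increase)
```

In this toy setting the loss increase scales linearly with the curvature, so a curvature 100 times larger gives an increase 100 times larger for the same perturbation. A real model's loss is not quadratic, so this only conveys the second-order intuition.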
