Improving Adversarial Robustness in Weight-quantized Neural Networks
This work is significant for researchers and practitioners deploying neural networks on hardware platforms, as it aims to improve the adversarial robustness of quantized models, an incremental but important step for real-world applications.
This paper addresses the problem of adversarial vulnerability in weight-quantized neural networks. The authors propose a boundary-based retraining method combined with a nonlinear mapping method. In their evaluations, this approach restores accuracy after quantization better than baseline methods under both black-box and white-box adversarial attacks.
Neural networks are becoming deeper and more computation-intensive. Quantization is a useful technique for deploying neural networks on hardware platforms and reducing computation costs with negligible performance loss. However, recent research reveals that neural network models, whether full-precision or quantized, are vulnerable to adversarial attacks. In this work, we analyze both adversarial and quantization losses and introduce criteria to evaluate them. We propose a boundary-based retraining method to mitigate adversarial and quantization losses together, and adopt a nonlinear mapping method to defend against white-box gradient-based adversarial attacks. Our evaluations demonstrate that our method restores accuracy after quantization better than other baseline methods under both black-box and white-box adversarial attacks. The results also show that adversarial training suffers from quantization loss and does not combine well with other training methods.
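To make the two loss sources named in the abstract concrete, the following is a minimal sketch, not the paper's method. It assumes a toy logistic-regression model, symmetric uniform weight quantization, and an FGSM-style white-box attack; the helper names (`quantize_weights`, `fgsm_perturb`, `accuracy`) and all parameter values are hypothetical and chosen only for illustration. The boundary-based retraining and nonlinear mapping steps are not implemented here.

```python
import numpy as np

def quantize_weights(w, num_bits=4):
    """Symmetric uniform quantization of a weight vector (illustrative assumption)."""
    levels = 2 ** (num_bits - 1) - 1           # number of positive quantization levels
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / levels                   # step size between adjacent levels
    return np.round(w / scale) * scale

def fgsm_perturb(x, y, w, b, eps=0.1):
    """FGSM-style step on a logistic-regression model: x + eps * sign(dLoss/dx)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))     # predicted probability of class 1
    grad_x = np.outer(p - y, w)                # gradient of cross-entropy w.r.t. each input
    return x + eps * np.sign(grad_x)

def accuracy(x, y, w, b):
    """Fraction of samples whose thresholded logit matches the label."""
    return float(np.mean(((x @ w + b) > 0).astype(int) == y))

rng = np.random.default_rng(0)
w = rng.normal(size=8)                         # toy full-precision weights
b = 0.0
x = rng.normal(size=(200, 8))
y = ((x @ w + b) > 0).astype(int)              # labels the full-precision model fits by construction

w_q = quantize_weights(w, num_bits=3)          # weight-quantized model
x_adv = fgsm_perturb(x, y, w_q, b, eps=0.2)    # white-box attack on the quantized model

clean_fp = accuracy(x, y, w, b)                # full-precision, clean inputs
clean_q = accuracy(x, y, w_q, b)               # quantized, clean inputs (quantization loss)
adv_q = accuracy(x_adv, y, w_q, b)             # quantized, adversarial inputs (adversarial loss)
print(clean_fp, clean_q, adv_q)
```

In this toy setting, the gap between `clean_fp` and `clean_q` reflects the accuracy lost to quantization alone, and the gap between `clean_q` and `adv_q` reflects the additional loss under a white-box gradient-based attack. The paper's own evaluation criteria may be defined differently.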