LG CR ML · Feb 14, 2018

Security Analysis and Enhancement of Model Compressed Deep Learning Systems under Adversarial Attacks

Qi Liu, Tao Liu, Zihao Liu, Yanzhi Wang, Yier Jin, Wujie Wen

arXiv:1802.05193v29.149 citations

Originality: Incremental advance

AI Analysis

This paper addresses security vulnerabilities in compressed deep learning systems as deployed in practice, an incremental but important step toward robust AI in hardware-constrained environments.

This paper investigates adversarial attacks on compressed deep learning systems, jointly considering input perturbations and the model reshaping introduced by compression techniques such as HashNet. It proposes a defense method called "gradient inhibition" that reduces the average adversarial attack success rate from 87.99% to 4.77% on MNIST and from 86.74% to 4.64% on CIFAR-10 with marginal accuracy loss.

Deep neural networks (DNNs) now achieve human-level performance on many complex intelligent tasks in real-world applications. However, they also raise growing security concerns. For example, emerging adversarial attacks show that even very small, often imperceptible input perturbations can easily mislead the cognitive function of deep learning systems (DLS). Existing DNN adversarial studies are performed narrowly on ideal software-level DNN models and focus on a single uncertainty factor, i.e., input perturbations. The impact on adversarial attacks of DNN model reshaping, which is introduced by hardware-favorable techniques such as hash-based weight compression in modern DNN hardware implementations, has never been discussed. In this work, we investigate for the first time the multi-factor adversarial attack problem in practical model-optimized deep learning systems by jointly considering DNN model reshaping (e.g., HashNet-based deep compression) and input perturbations. We first augment adversarial example generation methods for compressed DNN models by combining software-based approaches with a mathematical model of DNN reshaping. We then conduct a comprehensive robustness and vulnerability analysis of deep compressed DNN models under the derived adversarial attacks. We further develop a defense technique named "gradient inhibition" that hinders the generation of adversarial examples and thereby effectively mitigates adversarial attacks on both software- and hardware-oriented DNNs. Simulation results show that "gradient inhibition" decreases the average success rate of adversarial attacks from 87.99% to 4.77% on MNIST and from 86.74% to 4.64% on CIFAR-10, with marginal accuracy degradation across various DNNs.
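
The two factors the abstract combines, hash-based weight sharing and gradient-driven input perturbation, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration rather than the paper's method: `hashed_weights` uses a seeded random index map as a stand-in for a real hash function, the classifier is a single linear softmax layer, and `fgsm_step` is the standard fast-gradient-sign attack swapped in for the paper's augmented attack. The abstract does not describe how "gradient inhibition" works, so it is not shown.

```python
import numpy as np

def hashed_weights(shared, shape, seed=0):
    """Expand a small vector of shared values into a full weight matrix.

    Hypothetical stand-in for hash-based compression: a fixed, seeded
    index map assigns every (row, col) position to one shared bucket.
    """
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, shared.size, size=shape)
    return shared[idx]

def cross_entropy(x, y, W):
    """Cross-entropy loss of a linear softmax classifier on one input."""
    logits = W @ x
    m = logits.max()
    return m + np.log(np.exp(logits - m).sum()) - logits[y]

def fgsm_step(x, y, W, eps):
    """One fast-gradient-sign step that increases the loss for label y."""
    logits = W @ x
    p = np.exp(logits - logits.max())
    p /= p.sum()
    p[y] -= 1.0                  # dL/dlogits = softmax - one_hot(y)
    grad_x = W.T @ p             # chain rule through the linear layer
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

# Illustrative sizes only: 64 shared values stand in for 10 x 784 weights.
rng = np.random.default_rng(1)
shared = rng.normal(size=64)
W = hashed_weights(shared, (10, 784))
x = rng.uniform(size=784)        # a fake "image" with pixels in [0, 1]
x_adv = fgsm_step(x, y=3, W=W, eps=0.1)
print(cross_entropy(x, 3, W), cross_entropy(x_adv, 3, W))
```

Because many weight positions share the same stored value, the gradient with respect to the input is shaped by the index map as well as by the stored values, which gives one intuition for why the attack must account for model reshaping alongside the input perturbation.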
