CR, LG | Feb 8, 2018

PoTrojan: powerful neural-level trojan designs in deep learning models

Minhui Zou, Yang Shi, Chengliang Wang, Fangyu Li, WenZhan Song, Yu Wang

arXiv:1802.03043v2 | 27.863 citations

Originality: Highly original

AI Analysis

This paper addresses a security vulnerability in deep learning models that could pose significant risks in AI applications, and it represents a novel contribution to the field.

The paper tackles the threat of malicious neural networks by introducing PoTrojan, a method to design and insert neural-level trojans into pre-trained models without altering their architecture or parameters, which can cause malfunctions when triggered under rare conditions.

With the popularity of deep learning (DL), artificial intelligence (AI) has been applied in many areas of human life. The neural network, or artificial neural network (NN), the main technique behind DL, has been extensively studied to facilitate computer vision and natural language recognition. However, the more we rely on information technology, the more vulnerable we are. That is, malicious NNs could pose a huge threat in the coming AI era. In this paper, for the first time in the literature, we propose a novel approach to design and insert powerful neural-level trojans, or PoTrojans, in pre-trained NN models. Most of the time, PoTrojans remain inactive and do not affect the normal functions of their host NN models. PoTrojans can be triggered only under very rare conditions. Once activated, however, they can cause the host NN models to malfunction, producing false predictions or classifications, which is a significant threat to human society in the AI era. We explain the principles of PoTrojans and how easily they can be designed and inserted into pre-trained deep learning models. PoTrojans do not modify the existing architecture or parameters of the pre-trained models and require no re-training. Hence, the proposed method is very efficient.
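The behaviour described in the abstract (dormant on ordinary inputs, harmful only under a rare condition, host parameters left untouched) can be illustrated with a toy sketch. This is not the authors' design: the two-class linear "host", the `TRIGGER` pattern, the `TOLERANCE` threshold, and the `PAYLOAD` values are all hypothetical and chosen only to make the idea concrete.

```python
# Toy illustration only. Every name and number below is a hypothetical
# assumption, not taken from the paper.

def host_logits(x):
    """Stand-in for a pre-trained model: two fixed linear class scores."""
    w = [[1.0, -1.0], [-1.0, 1.0]]  # host weights, never modified
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

TRIGGER = [0.987, 0.123]   # hypothetical rare input pattern
TOLERANCE = 1e-6           # squared-distance threshold for a match
PAYLOAD = [0.0, 100.0]     # added to the logits only when the trigger fires

def trigger_fires(x):
    """Hard-threshold unit: True only if x is (almost) exactly TRIGGER."""
    distance = sum((xi - ti) ** 2 for xi, ti in zip(x, TRIGGER))
    return distance < TOLERANCE

def trojaned_logits(x):
    """Host output, plus the payload when (and only when) the trigger fires."""
    logits = host_logits(x)
    if trigger_fires(x):
        logits = [l + p for l, p in zip(logits, PAYLOAD)]
    return logits

def predict(logits_fn, x):
    """Index of the largest logit."""
    logits = logits_fn(x)
    return max(range(len(logits)), key=lambda i: logits[i])
```

In this sketch, `predict(trojaned_logits, x)` matches `predict(host_logits, x)` for any input that is not within `TOLERANCE` of `TRIGGER`, so the added unit is invisible in ordinary use; only on the trigger pattern does the payload override the host's decision. The host weights inside `host_logits` are never changed and nothing is re-trained, which mirrors the efficiency claim in the abstract.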
