CR Jan 11, 2019

Explaining Vulnerabilities of Deep Learning to Adversarial Malware Binaries

Luca Demetrio, Battista Biggio, Giovanni Lagorio, Fabio Roli, Alessandro Armando

arXiv:1901.03583v230.9142 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses a critical security issue for malware detection systems by revealing why a deep learning detector is vulnerable, but it is incremental because it builds on existing explainability techniques and attack methods.

The paper asks why deep learning models for malware detection are vulnerable to adversarial examples. Using explainable AI to analyze a convolutional neural network, it finds that the network relies on file header features rather than on the data and text sections of the executable. Based on this, it proposes a novel attack that manipulates only a few tens of bytes in the header, so it changes far fewer bytes than state-of-the-art methods and needs no padding appended to the file.

Recent work has shown that deep-learning algorithms for malware detection are also susceptible to adversarial examples, i.e., carefully-crafted perturbations to input malware that enable misleading classification. Although this has questioned their suitability for this task, it is not yet clear why such algorithms are also easily fooled in this particular application domain. In this work, we take a first step to tackle this issue by leveraging explainable machine-learning algorithms developed to interpret the black-box decisions of deep neural networks. In particular, we use an explainability technique known as feature attribution to identify the most influential input features contributing to each decision, and adapt it to provide meaningful explanations for the classification of malware binaries. In this case, we find that a recently-proposed convolutional neural network does not learn any meaningful characteristic for malware detection from the data and text sections of executable files, but rather tends to learn to discriminate between benign and malware samples based on the characteristics found in the file header. Based on this finding, we propose a novel attack algorithm that generates adversarial malware binaries by changing only a few tens of bytes in the file header. Compared with other state-of-the-art attack algorithms, our attack does not require injecting any padding bytes at the end of the file, and it is much more efficient, as it requires manipulating far fewer bytes.
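To make the feature-attribution step concrete, here is a minimal sketch that scores each input byte of a toy model and then sums the scores per file region. Everything in it is an assumption for illustration: the abstract does not name the attribution method, so integrated gradients stands in as one representative choice; the model is a toy logistic regression rather than the convolutional network studied in the paper; and the weights, the 512-byte input, and the 64-byte "header" boundary are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    """Toy 'malware score' in [0, 1]: logistic regression over scaled bytes."""
    return sigmoid(x @ w + b)

def model_grad(x, w, b):
    """Analytic gradient of the toy score with respect to the input vector."""
    s = model(x, w, b)
    return s * (1.0 - s) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann approximation of integrated gradients.

    Gradients are averaged along the straight path from `baseline` to `x`
    and scaled by the input difference (x - baseline).
    """
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        total += model_grad(baseline + a * (x - baseline), w, b)
    return (x - baseline) * total / steps

def attribution_by_region(attr, regions):
    """Sum absolute attribution inside each named [start, end) byte range."""
    return {name: float(np.abs(attr[s:e]).sum())
            for name, (s, e) in regions.items()}

# Hypothetical input: 512 random bytes scaled to [0, 1]; all-zero baseline.
rng = np.random.default_rng(0)
n_bytes = 512
x = rng.integers(0, 256, size=n_bytes) / 255.0
baseline = np.zeros(n_bytes)

# Hypothetical weights, deliberately large on the first 64 positions so the
# toy model "looks at the header", mimicking the behaviour the paper reports.
w = rng.normal(0.0, 0.01, size=n_bytes)
w[:64] = rng.normal(0.0, 0.5, size=64)
b = 0.0

attr = integrated_gradients(x, baseline, w, b)

# Hypothetical region boundaries; a real analysis would parse them from the file.
regions = {"header": (0, 64), "body": (64, n_bytes)}
by_region = attribution_by_region(attr, regions)

# Completeness check: attributions should sum to score(x) - score(baseline).
gap = model(x, w, b) - model(baseline, w, b)
print(by_region, float(attr.sum()), float(gap))
```

Grouping per-byte scores by region is what turns a raw attribution vector into a statement such as "the decision depends mostly on the header". The completeness check is a quick sanity test that the numerical integration used enough steps.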

