Malware Makeover: Breaking ML-based Static Analysis by Modifying Executable Bytes
This work exposes a critical vulnerability in ML-based static analysis for malware detection, one that poses a security risk for anti-virus vendors and their users.
The paper shows how to evade DNN-based malware detectors by modifying an executable's bytes while preserving its functionality. The attack often achieves success rates near 100% against the three DNNs evaluated, and in certain cases reaches 85% against some commercial anti-viruses.
Motivated by the transformative impact of deep neural networks (DNNs) in various domains, researchers and anti-virus vendors have proposed DNNs for malware detection from raw bytes that do not require manual feature engineering. In this work, we propose an attack that interweaves binary-diversification techniques and optimization frameworks to mislead such DNNs while preserving the functionality of binaries. Unlike prior attacks, ours manipulates instructions that are a functional part of the binary, which makes it particularly challenging to defend against. We evaluated our attack against three DNNs in white- and black-box settings, and found that it often achieved success rates near 100%. Moreover, we found that our attack can fool some commercial anti-viruses, in certain cases with a success rate of 85%. We explored several defenses, both new and old, and identified some that can foil over 80% of our evasion attempts. However, these defenses may still be susceptible to evasion by attacks, and so we advocate for augmenting malware-detection systems with methods that do not rely on machine learning.
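To make the idea of combining functionality-preserving transformations with optimization concrete, the following is a minimal toy sketch, not the attack described in the paper. It assumes a hypothetical table `EQUIVALENTS` of interchangeable tokens (standing in for instruction encodings assumed to behave identically) and a stand-in scoring function `toy_score` in place of a real detector. The greedy search simply keeps a swap only when it lowers the score. All names and values here are illustrative assumptions.

```python
import random

# Hypothetical table: each pair stands in for two instruction encodings that
# are assumed to behave identically. The values are illustrative only.
EQUIVALENTS = {0x01: 0x03, 0x03: 0x01, 0x29: 0x2B, 0x2B: 0x29}


def toy_score(tokens):
    """Stand-in for a detector's score in [0, 1]. This is not a real model."""
    if not tokens:
        return 0.0
    flagged = sum(1 for t in tokens if t in (0x01, 0x29))
    return flagged / len(tokens)


def greedy_transform(tokens, score_fn, threshold=0.5, max_iters=1000, seed=0):
    """Try equivalence swaps one at a time, keeping those that lower the score."""
    rng = random.Random(seed)
    current = list(tokens)
    best = score_fn(current)
    # Positions whose token has an assumed-equivalent replacement.
    swappable = [i for i, t in enumerate(current) if t in EQUIVALENTS]
    for _ in range(max_iters):
        if best < threshold or not swappable:
            break
        i = rng.choice(swappable)
        candidate = list(current)
        candidate[i] = EQUIVALENTS[candidate[i]]
        score = score_fn(candidate)
        if score < best:  # accept only strict improvements
            current, best = candidate, score
    return current, best


original = [0x01, 0x29, 0x90, 0x01]
transformed, final_score = greedy_transform(original, toy_score)
print(toy_score(original), final_score)
```

Because every change is drawn from the equivalence table, the sequence length and all tokens outside the table are left untouched. In this toy setting that plays the role of preserving functionality, while the score function alone drives the search.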