CR Aug 17, 2020
Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection
Luca Demetrio, Scott E. Coull, Battista Biggio et al.
Recent work has shown that adversarial Windows malware samples - referred to as adversarial EXEmples in this paper - can bypass machine learning-based detection relying on static code analysis by perturbing relatively few input bytes. To preserve malicious functionality, previous attacks either add bytes to existing non-functional areas of the file, potentially limiting their effectiveness, or require running computationally-demanding validation steps to discard malware variants that do not correctly execute in sandbox environments. In this work, we overcome these limitations by developing a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks based on practical, functionality-preserving manipulations to the Windows Portable Executable (PE) file format. These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section. Our experimental results show that these attacks outperform existing ones in both white-box and black-box scenarios, achieving a better trade-off between evasion rate and size of the injected payload, while also enabling evasion of models that have been shown to be robust to previous attacks. To facilitate reproducibility of our findings, we open-source our framework and all the corresponding attack implementations as part of the secml-malware Python library. We conclude this work by discussing the limitations of current machine learning-based malware detectors, along with potential mitigation strategies based on embedding domain knowledge coming from subject-matter experts directly into the learning process.
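To make "manipulating the DOS header" concrete, the following minimal sketch lists the byte offsets that precede the PE header in a synthetic file. It is not taken from secml-malware. The `MZ` magic at offset 0 and the `e_lfanew` pointer at offset 0x3C are part of the PE format. Treating every other byte before the PE signature as free to edit is an assumption made here for illustration, and the function names `editable_dos_indices` and `build_toy_header` are hypothetical.

```python
import struct
from typing import List

DOS_MAGIC = b"MZ"
PE_SIGNATURE = b"PE\x00\x00"
E_LFANEW_OFFSET = 0x3C  # 4-byte little-endian pointer to the PE signature


def editable_dos_indices(data: bytes) -> List[int]:
    """Return offsets before the PE header, excluding magic and e_lfanew.

    Assumption (not stated in the abstract): every byte before the PE
    signature other than the MZ magic and e_lfanew can be rewritten.
    """
    if data[:2] != DOS_MAGIC:
        raise ValueError("not an MZ executable")
    if len(data) < E_LFANEW_OFFSET + 4:
        raise ValueError("truncated DOS header")
    (pe_offset,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    if data[pe_offset:pe_offset + 4] != PE_SIGNATURE:
        raise ValueError("e_lfanew does not point at a PE signature")
    before_pointer = range(2, E_LFANEW_OFFSET)
    after_pointer = range(E_LFANEW_OFFSET + 4, pe_offset)
    return list(before_pointer) + list(after_pointer)


def build_toy_header(pe_offset: int = 0x80) -> bytes:
    """Build a synthetic, non-executable header used only for this demo."""
    buf = bytearray(pe_offset + 4)
    buf[0:2] = DOS_MAGIC
    struct.pack_into("<I", buf, E_LFANEW_OFFSET, pe_offset)
    buf[pe_offset:pe_offset + 4] = PE_SIGNATURE
    return bytes(buf)


if __name__ == "__main__":
    toy = build_toy_header()
    print(len(editable_dos_indices(toy)), "candidate byte offsets")
```

The sketch only enumerates candidate offsets. Choosing byte values is the optimization problem the paper addresses and is out of scope here.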
CR Jun 17, 2020
Never Trust Your Victim: Weaponizing Vulnerabilities in Security Scanners
Andrea Valenza, Gabriele Costa, Alessandro Armando
The first step of every attack is reconnaissance, i.e., acquiring information about the target. A common belief is that there is almost no risk in scanning a target from a remote location. In this paper we falsify this belief by showing that scanners are exposed to the same risks as their targets. Our methodology is based on a novel attacker model where the scan author becomes the victim of a counter-strike. We developed a working prototype, called RevOK, and we applied it to 78 scanning systems. Of these, 36 were found vulnerable to XSS. Remarkably, RevOK also found a severe vulnerability in Metasploit Pro, a mainstream penetration testing tool.
CR Mar 30, 2020
Functionality-preserving Black-box Optimization of Adversarial Windows Malware
Luca Demetrio, Battista Biggio, Giovanni Lagorio et al.
Windows malware detectors based on machine learning are vulnerable to adversarial examples, even if the attacker is only given black-box query access to the model. The main drawbacks of these attacks are that (i) they are query-inefficient, as they rely on iteratively applying random transformations to the input malware; and (ii) they may also require executing the adversarial malware in a sandbox at each iteration of the optimization process, to ensure that its intrusive functionality is preserved. In this paper, we overcome these issues by presenting a novel family of black-box attacks that are both query-efficient and functionality-preserving, as they rely on the injection of benign content - which will never be executed - either at the end of the malicious file, or within some newly-created sections. Our attacks are formalized as a constrained minimization problem which also enables optimizing the trade-off between the probability of evading detection and the size of the injected payload. We empirically investigate this trade-off on two popular static Windows malware detectors, and show that our black-box attacks can bypass them with only a few queries and small payloads, even when they only return the predicted labels. We also evaluate whether our attacks transfer to other commercial antivirus solutions, and surprisingly find that they can evade, on average, more than 12 commercial antivirus engines. We conclude by discussing the limitations of our approach, and its possible future extensions to target malware classifiers based on dynamic analysis.
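One plausible way to write the trade-off described in this abstract is given below. The abstract gives no symbols, so all notation is an assumption: $x$ is the input malware, $s$ the injected benign content, $f$ the detector's output, $C(s)$ the payload size, $\lambda$ a trade-off weight, and the constraint is assumed to be a budget $T$ on the number of queries $q$.

```latex
\min_{s \in \mathcal{S}} \; F(s) = f(x \oplus s) + \lambda \, C(s)
\qquad \text{subject to} \qquad q \le T
```

Under this reading, a larger $\lambda$ favours smaller payloads at the cost of a higher detector score, and a smaller $\lambda$ does the opposite.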
CR Jan 18, 2020
Automating the Generation of Cyber Range Virtual Scenarios with VSDL
Gabriele Costa, Enrico Russo, Alessandro Armando
A cyber range is an environment used for training security experts and testing attack and defence tools and procedures. Usually, a cyber range simulates one or more critical infrastructures that attacking (red) and defending (blue) teams must compromise and protect, respectively. The infrastructure can be physically assembled, but it is much more convenient to rely on the Infrastructure as a Service (IaaS) paradigm. Although some modern technologies support IaaS, the design and deployment of scenarios of interest is mostly a manual operation. As a consequence, it is common practice to have a cyber range hosting a few (sometimes only one) consolidated scenarios. However, reusing the same scenario may significantly reduce the effectiveness of the training and testing sessions. In this paper, we propose a framework for automating the definition and deployment of arbitrarily complex cyber range scenarios. The framework relies on the virtual scenario description language (VSDL), i.e., a domain-specific language for defining high-level features of the desired infrastructure while hiding low-level details. The semantics of VSDL is given in terms of constraints that must be satisfied by the virtual infrastructure. These constraints are then submitted to an SMT solver for checking the satisfiability of the specification. If satisfiable, the specification gives rise to a model that is automatically converted to a set of deployment scripts to be submitted to the IaaS provider.
CR Jan 11, 2019
Explaining Vulnerabilities of Deep Learning to Adversarial Malware Binaries
Luca Demetrio, Battista Biggio, Giovanni Lagorio et al.
Recent work has shown that deep-learning algorithms for malware detection are also susceptible to adversarial examples, i.e., carefully-crafted perturbations to input malware that enable misleading classification. Although this has called into question their suitability for this task, it is not yet clear why such algorithms are easily fooled also in this particular application domain. In this work, we take a first step to tackle this issue by leveraging explainable machine-learning algorithms developed to interpret the black-box decisions of deep neural networks. In particular, we use an explainability technique known as feature attribution to identify the most influential input features contributing to each decision, and adapt it to provide meaningful explanations to the classification of malware binaries. In this case, we find that a recently-proposed convolutional neural network does not learn any meaningful characteristic for malware detection from the data and text sections of executable files, but rather tends to learn to discriminate between benign and malware samples based on the characteristics found in the file header. Based on this finding, we propose a novel attack algorithm that generates adversarial malware binaries by changing only a few tens of bytes in the file header. With respect to the other state-of-the-art attack algorithms, our attack does not require injecting any padding bytes at the end of the file, and it is much more efficient, as it requires manipulating far fewer bytes.
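As an illustration of what feature attribution computes, the sketch below applies integrated gradients to a toy logistic model. The abstract does not name the attribution method, so integrated gradients is an assumed stand-in. The model, its weights, and the four features are hypothetical and have no connection to the network studied in the paper.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def model(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Toy logistic scorer: higher output means 'more malicious'."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gradient(x: Sequence[float], w: Sequence[float], b: float) -> List[float]:
    """Analytic gradient of the logistic output with respect to x."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(
    x: Sequence[float],
    baseline: Sequence[float],
    w: Sequence[float],
    b: float,
    steps: int = 200,
) -> List[float]:
    """Approximate the path integral of gradients with a midpoint rule."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, gi in enumerate(gradient(point, w, b)):
            totals[i] += gi
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, totals)]


if __name__ == "__main__":
    # Hypothetical weights: the first feature dominates the decision.
    weights = [2.0, 0.1, -0.05, 0.0]
    sample = [1.0, 1.0, 1.0, 1.0]
    zeros = [0.0, 0.0, 0.0, 0.0]
    for i, a in enumerate(integrated_gradients(sample, zeros, weights, 0.0)):
        print(f"feature {i}: attribution {a:+.4f}")
```

The attributions sum, up to numerical error, to the difference between the model output on the sample and on the baseline, which is the property that makes them readable as per-feature contributions.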
CR Sep 4, 2012
Security Issues in the Android Cross-Layer Architecture
Alessandro Armando, Alessio Merlo, Luca Verderame
The security of Android has been recently challenged by the discovery of a number of vulnerabilities involving different layers of the Android stack. We argue that such vulnerabilities are largely related to the interplay among layers composing the Android stack. We also argue that this interplay has been underestimated from a security point of view and that a systematic analysis of it has not been carried out yet. To this end, in this paper we provide a simple model of the Android cross-layer interactions based on the concept of flow, as a basis for analyzing the Android interplay. In particular, our model allows us to reason about the security implications associated with the cross-layer interactions in Android, including a recently discovered vulnerability that allows a malicious application to make Android devices totally unresponsive. We used the proposed model to carry out an empirical assessment of some flows within the Android cross-layered architecture. Our experiments indicate that little control is exercised by the Android Security Framework (ASF) over cross-layer interactions in Android. In particular, we observed that the ASF fails to discriminate the originator of a flow and that sensitive security issues arise between the Android stack and the Linux kernel, thereby indicating that the attack surface of the Android platform is wider than expected.