CRJul 27, 2023
Decoding the Secrets of Machine Learning in Malware Classification: A Deep Dive into Datasets, Feature Extraction, and Model PerformanceSavino Dambra, Yufei Han, Simone Aonzo et al.
Many studies have proposed machine-learning (ML) models for malware detection and classification, reporting an almost-perfect performance. However, they assemble ground-truth in different ways, use diverse static- and dynamic-analysis techniques for feature extraction, and even differ on what they consider a malware family. As a consequence, our community still lacks an understanding of malware classification results: whether they are tied to the nature and distribution of the collected dataset, to what extent the number of families and samples in the training dataset influence performance, and how well static and dynamic features complement each other. This work sheds light on those open questions. by investigating the key factors influencing ML-based malware detection and classification. For this, we collect the largest balanced malware dataset so far with 67K samples from 670 families (100 samples each), and train state-of-the-art models for malware detection and family classification using our dataset. Our results reveal that static features perform better than dynamic features, and that combining both only provides marginal improvement over static features. We discover no correlation between packing and classification accuracy, and that missing behaviors in dynamically-extracted features highly penalize their performance. We also demonstrate how a larger number of families to classify make the classification harder, while a higher number of samples per family increases accuracy. Finally, we find that models trained on a uniform distribution of samples per family better generalize on unseen data.
CRDec 21, 2021
Longitudinal Study of the Prevalence of Malware Evasive TechniquesLorenzo Maffia, Dario Nisi, Platon Kotzias et al.
By their very nature, malware samples employ a variety of techniques to conceal their malicious behavior and hide it from analysis tools. To mitigate the problem, a large number of different evasion techniques have been documented over the years, and PoC implementations have been collected in public frameworks, like the popular Al-Khaser. As malware authors tend to reuse existing approaches, it is common to observe the same evasive techniques in malware samples of different families. However, no measurement study has been conducted to date to assess the adoption and prevalence of evasion techniques. In this paper, we present a large-scale study, conducted by dynamically analyzing more than 180K Windows malware samples, on the evolution of evasive techniques over the years. To perform the experiments, we developed a custom Pin-based Evasive Program Profiler (Pepper), a tool capable of both detecting and circumventing 53 anti-dynamic-analysis techniques of different categories, ranging from anti-debug to virtual machine detection. To observe the phenomenon of evasion from different points of view, we employed four different datasets, including benign files, advanced persistent threat (APTs), malware samples collected over a period of five years, and a recent collection of different families submitted to VirusTotal over a one-month period.
CRSep 8, 2021
Unsupervised Detection and Clustering of Malicious TLS FlowsGibran Gomez, Platon Kotzias, Matteo Dell'Amico et al.
Malware abuses TLS to encrypt its malicious traffic, preventing examination by content signatures and deep packet inspection. Network detection of malicious TLS flows is an important, but challenging, problem. Prior works have proposed supervised machine learning detectors using TLS features. However, by trying to represent all malicious traffic, supervised binary detectors produce models that are too loose, thus introducing errors. Furthermore, they do not distinguish flows generated by different malware. On the other hand, supervised multi-class detectors produce tighter models and can classify flows by malware family, but require family labels, which are not available for many samples. To address these limitations, this work proposes a novel unsupervised approach to detect and cluster malicious TLS flows. Our approach takes as input network traces from sandboxes. It clusters similar TLS flows using 90 features that capture properties of the TLS client, TLS server, certificate, and encrypted payload; and uses the clusters to build an unsupervised detector that can assign a malicious flow to the cluster it belongs to, or determine it is benign. We evaluate our approach using 972K traces from a commercial sandbox and 35M TLS flows from a research network. Our clustering shows very high precision and recall with an F1 score of 0.993. We compare our unsupervised detector with two state-of-the-art approaches, showing that it outperforms both. The false detection rate of our detector is 0.032% measured over four months of traffic.
CROct 20, 2020
How Did That Get In My Phone? Unwanted App Distribution on Android DevicesPlaton Kotzias, Juan Caballero, Leyla Bilge
Android is the most popular operating system with billions of active devices. Unfortunately, its popularity and openness makes it attractive for unwanted apps, i.e., malware and potentially unwanted programs (PUP). In Android, app installations typically happen via the official and alternative markets, but also via other smaller and less understood alternative distribution vectors such as Web downloads, pay-per-install (PPI) services, backup restoration, bloatware, and IM tools. This work performs a thorough investigation on unwanted app distribution by quantifying and comparing distribution through different vectors. At the core of our measurements are reputation logs of a large security vendor, which include 7.9M apps observed in 12M devices between June and September 2019. As a first step, we measure that between 10% and 24% of users devices encounter at least one unwanted app, and compare the prevalence of malware and PUP. An analysis of the who-installs-who relationships between installers and child apps reveals that the Play market is the main app distribution vector, responsible for 87% of all installs and 67% of unwanted app installs, but it also has the best defenses against unwanted apps. Alternative markets distribute instead 5.7% of all apps, but over 10% of unwanted apps. Bloatware is also a significant unwanted app distribution vector with 6% of those installs. And, backup restoration is an unintentional distribution vector that may even allow unwanted apps to survive users' phone replacement. We estimate unwanted app distribution via PPI to be smaller than on Windows. Finally, we observe that Web downloads are rare, but provide a riskier proposition even compared to alternative markets.