CYAug 9, 2023
Targeted and Troublesome: Tracking and Advertising on Children's WebsitesZahra Moti, Asuman Senol, Hamid Bostani et al.
On the modern web, trackers and advertisers frequently construct and monetize users' detailed behavioral profiles without consent. Despite various studies on web tracking mechanisms and advertisements, there has been no rigorous study focusing on websites targeted at children. To address this gap, we present a measurement of tracking and (targeted) advertising on websites directed at children. Motivated by lacking a comprehensive list of child-directed (i.e., targeted at children) websites, we first build a multilingual classifier based on web page titles and descriptions. Applying this classifier to over two million pages, we compile a list of two thousand child-directed websites. Crawling these sites from five vantage points, we measure the prevalence of trackers, fingerprinting scripts, and advertisements. Our crawler detects ads displayed on child-directed websites and determines if ad targeting is enabled by scraping ad disclosure pages whenever available. Our results show that around 90% of child-directed websites embed one or more trackers, and about 27% contain targeted advertisements--a practice that should require verifiable parental consent. Next, we identify improper ads on child-directed websites by developing an ML pipeline that processes both images and text extracted from ads. The pipeline allows us to run semantic similarity queries for arbitrary search terms, revealing ads that promote services related to dating, weight loss, and mental health; as well as ads for sex toys and flirting chat services. Some of these ads feature repulsive and sexually explicit imagery. In summary, our findings indicate a trend of non-compliance with privacy regulations and troubling ad safety practices among many advertisers and child-directed websites. To protect children and create a safer online environment, regulators and stakeholders must adopt and enforce more stringent measures.
LGMay 30, 2022
Level Up with ML Vulnerability Identification: Leveraging Domain Constraints in Feature Space for Robust Android Malware DetectionHamid Bostani, Zhengyu Zhao, Zhuoran Liu et al.
Machine Learning (ML) promises to enhance the efficacy of Android Malware Detection (AMD); however, ML models are vulnerable to realistic evasion attacks--crafting realizable Adversarial Examples (AEs) that satisfy Android malware domain constraints. To eliminate ML vulnerabilities, defenders aim to identify susceptible regions in the feature space where ML models are prone to deception. The primary approach to identifying vulnerable regions involves investigating realizable AEs, but generating these feasible apps poses a challenge. For instance, previous work has relied on generating either feature-space norm-bounded AEs or problem-space realizable AEs in adversarial hardening. The former is efficient but lacks full coverage of vulnerable regions while the latter can uncover these regions by satisfying domain constraints but is known to be time-consuming. To address these limitations, we propose an approach to facilitate the identification of vulnerable regions. Specifically, we introduce a new interpretation of Android domain constraints in the feature space, followed by a novel technique that learns them. Our empirical evaluations across various evasion attacks indicate effective detection of AEs using learned domain constraints, with an average of 89.6%. Furthermore, extensive experiments on different Android malware detectors demonstrate that utilizing our learned domain constraints in Adversarial Training (AT) outperforms other AT-based defenses that rely on norm-bounded AEs or state-of-the-art non-uniform perturbations. Finally, we show that retraining a malware detector with a wide variety of feature-space realizable AEs results in a 77.9% robustness improvement against realizable AEs generated by unknown problem-space transformations, with up to 70x faster training than using problem-space realizable AEs.
CRAug 27, 2024
Improving Adversarial Robustness in Android Malware Detection by Reducing the Impact of Spurious CorrelationsHamid Bostani, Zhengyu Zhao, Veelasha Moonsamy
Machine learning (ML) has demonstrated significant advancements in Android malware detection (AMD); however, the resilience of ML against realistic evasion attacks remains a major obstacle for AMD. One of the primary factors contributing to this challenge is the scarcity of reliable generalizations. Malware classifiers with limited generalizability tend to overfit spurious correlations derived from biased features. Consequently, adversarial examples (AEs), generated by evasion attacks, can modify these features to evade detection. In this study, we propose a domain adaptation technique to improve the generalizability of AMD by aligning the distribution of malware samples and AEs. Specifically, we utilize meaningful feature dependencies, reflecting domain constraints in the feature space, to establish a robust feature space. Training on the proposed robust feature space enables malware classifiers to learn from predefined patterns associated with app functionality rather than from individual features. This approach helps mitigate spurious correlations inherent in the initial feature space. Our experiments conducted on DREBIN, a renowned Android malware detector, demonstrate that our approach surpasses the state-of-the-art defense, Sec-SVM, when facing realistic evasion attacks. In particular, our defense can improve adversarial robustness by up to 55% against realistic evasion attacks compared to Sec-SVM.
CRSep 9, 2020Code
Where's Crypto?: Automated Identification and Classification of Proprietary Cryptographic Primitives in Binary CodeCarlo Meijer, Veelasha Moonsamy, Jos Wetzels
The continuing use of proprietary cryptography in embedded systems across many industry verticals, from physical access control systems and telecommunications to machine-to-machine authentication, presents a significant obstacle to black-box security-evaluation efforts. In-depth security analysis requires locating and classifying the algorithm in often very large binary images, thus rendering manual inspection, even when aided by heuristics, time consuming. In this paper, we present a novel approach to automate the identification and classification of (proprietary) cryptographic primitives within binary code. Our approach is based on Data Flow Graph (DFG) isomorphism, previously proposed by Lestringant et al. Unfortunately, their DFG isomorphism approach is limited to known primitives only, and relies on heuristics for selecting code fragments for analysis. By combining the said approach with symbolic execution, we overcome all limitations of their work, and are able to extend the analysis into the domain of unknown, proprietary cryptographic primitives. To demonstrate that our proposal is practical, we develop various signatures, each targeted at a distinct class of cryptographic primitives, and present experimental evaluations for each of them on a set of binaries, both publicly available (and thus providing reproducible results), and proprietary ones. Lastly, we provide a free and open-source implementation of our approach, called Where's Crypto?, in the form of a plug-in for the popular IDA disassembler.
LGDec 24, 2024
On the Effectiveness of Adversarial Training on Malware ClassifiersHamid Bostani, Jacopo Cortellazzi, Daniel Arp et al.
Adversarial Training (AT) has been widely applied to harden learning-based classifiers against adversarial evasive attacks. However, its effectiveness in identifying and strengthening vulnerable areas of the model's decision space while maintaining high performance on clean data of malware classifiers remains an under-explored area. In this context, the robustness that AT achieves has often been assessed against unrealistic or weak adversarial attacks, which negatively affect performance on clean data and are arguably no longer threats. Previous work seems to suggest robustness is a task-dependent property of AT. We instead argue it is a more complex problem that requires exploring AT and the intertwined roles played by certain factors within data, feature representations, classifiers, and robust optimization settings, as well as proper evaluation factors, such as the realism of evasion attacks, to gain a true sense of AT's effectiveness. In our paper, we address this gap by systematically exploring the role such factors have in hardening malware classifiers through AT. Contrary to recent prior work, a key observation of our research and extensive experiments confirm the hypotheses that all such factors influence the actual effectiveness of AT, as demonstrated by the varying degrees of success from our empirical analysis. We identify five evaluation pitfalls that affect state-of-the-art studies and summarize our insights in ten takeaways to draw promising research directions toward better understanding the factors' settings under which adversarial training works at best.
CRDec 3, 2021
IRShield: A Countermeasure Against Adversarial Physical-Layer Wireless SensingPaul Staat, Simon Mulzer, Stefan Roth et al.
Wireless radio channels are known to contain information about the surrounding propagation environment, which can be extracted using established wireless sensing methods. Thus, today's ubiquitous wireless devices are attractive targets for passive eavesdroppers to launch reconnaissance attacks. In particular, by overhearing standard communication signals, eavesdroppers obtain estimations of wireless channels which can give away sensitive information about indoor environments. For instance, by applying simple statistical methods, adversaries can infer human motion from wireless channel observations, allowing to remotely monitor premises of victims. In this work, building on the advent of intelligent reflecting surfaces (IRSs), we propose IRShield as a novel countermeasure against adversarial wireless sensing. IRShield is designed as a plug-and-play privacy-preserving extension to existing wireless networks. At the core of IRShield, we design an IRS configuration algorithm to obfuscate wireless channels. We validate the effectiveness with extensive experimental evaluations. In a state-of-the-art human motion detection attack using off-the-shelf Wi-Fi devices, IRShield lowered detection rates to 5% or less.
LGOct 7, 2021
EvadeDroid: A Practical Evasion Attack on Machine Learning for Black-box Android Malware DetectionHamid Bostani, Veelasha Moonsamy
Over the last decade, researchers have extensively explored the vulnerabilities of Android malware detectors to adversarial examples through the development of evasion attacks; however, the practicality of these attacks in real-world scenarios remains arguable. The majority of studies have assumed attackers know the details of the target classifiers used for malware detection, while in reality, malicious actors have limited access to the target classifiers. This paper introduces EvadeDroid, a problem-space adversarial attack designed to effectively evade black-box Android malware detectors in real-world scenarios. EvadeDroid constructs a collection of problem-space transformations derived from benign donors that share opcode-level similarity with malware apps by leveraging an n-gram-based approach. These transformations are then used to morph malware instances into benign ones via an iterative and incremental manipulation strategy. The proposed manipulation technique is a query-efficient optimization algorithm that can find and inject optimal sequences of transformations into malware apps. Our empirical evaluations, carried out on 1K malware apps, demonstrate the effectiveness of our approach in generating real-world adversarial examples in both soft- and hard-label settings. Our findings reveal that EvadeDroid can effectively deceive diverse malware detectors that utilize different features with various feature types. Specifically, EvadeDroid achieves evasion rates of 80%-95% against DREBIN, Sec-SVM, ADE-MA, MaMaDroid, and Opcode-SVM with only 1-9 queries. Furthermore, we show that the proposed problem-space adversarial attack is able to preserve its stealthiness against five popular commercial antiviruses with an average of 79% evasion rate, thus demonstrating its feasibility in the real world.
CRJul 16, 2020
Less is More: A privacy-respecting Android malware classifier using Federated LearningRafa Gálvez, Veelasha Moonsamy, Claudia Diaz
In this paper we present LiM ("Less is More"), a malware classification framework that leverages Federated Learning to detect and classify malicious apps in a privacy-respecting manner. Information about newly installed apps is kept locally on users' devices, so that the provider cannot infer which apps were installed by users. At the same time, input from all users is taken into account in the federated learning process and they all benefit from better classification performance. A key challenge of this setting is that users do not have access to the ground truth (i.e. they cannot correctly identify whether an app is malicious). To tackle this, LiM uses a safe semi-supervised ensemble that maximizes classification accuracy with respect to a baseline classifier trained by the service provider (i.e. the cloud). We implement LiM and show that the cloud server has F1 score of 95%, while clients have perfect recall with only 1 false positive in >100 apps, using a dataset of 25K clean apps and 25K malicious apps, 200 users and 50 rounds of federation. Furthermore, we conduct a security analysis and demonstrate that LiM is robust against both poisoning attacks by adversaries who control half of the clients, and inference attacks performed by an honest-but-curious cloud server. Further experiments with MaMaDroid's dataset confirm resistance against poisoning attacks and a performance improvement due to the federation.
CYJan 29, 2019
Malicious cryptocurrency miners: Status and OutlookRadhesh Krishnan Konoth, Rolf van Wegberg, Veelasha Moonsamy et al.
In this study, we examine the behavior and profitability of modern malware that mines cryptocurrency. Unlike previous studies, we look at the cryptocurrency market as a whole, rather than just Bitcoin. We not only consider PCs, but also mobile phones, and IoT devices. In the past few years, criminals have attacked all these platforms for the purpose of cryptocurrency mining. The question is: how much money do they make? It is common knowledge that mining Bitcoin is now very difficult, so why do the criminals even target low-end devices for mining purposes? By analyzing the most important families of malicious cryptocurrency miners that were active between 2014 and 2017, we are able to report how they work, which currency they mine, and how profitable it is to do so. We will see that the evolution of the cryptocurrency market with many new cryptocurrencies that are still CPU minable and offer better privacy to criminals and have contributed to making mining malware attractive again -- with attackers generating a continuous stream of profit that in some cases may reach in the millions.
CRNov 11, 2016
Systematic Classification of Side-Channel Attacks: A Case Study for Mobile DevicesRaphael Spreitzer, Veelasha Moonsamy, Thomas Korak et al.
Side-channel attacks on mobile devices have gained increasing attention since their introduction in 2007. While traditional side-channel attacks, such as power analysis attacks and electromagnetic analysis attacks, required physical presence of the attacker as well as expensive equipment, an (unprivileged) application is all it takes to exploit the leaking information on modern mobile devices. Given the vast amount of sensitive information that are stored on smartphones, the ramifications of side-channel attacks affect both the security and privacy of users and their devices. In this paper, we propose a new categorization system for side-channel attacks, which is necessary as side-channel attacks have evolved significantly since their scientific investigations during the smart card era in the 1990s. Our proposed classification system allows to analyze side-channel attacks systematically, and facilitates the development of novel countermeasures. Besides this new categorization system, the extensive survey of existing attacks and attack strategies provides valuable insights into the evolving field of side-channel attacks, especially when focusing on mobile devices. We conclude by discussing open issues and challenges in this context and outline possible future research directions.
CRSep 9, 2016
No Free Charge Theorem: a Covert Channel via USB Charging Cable on Mobile DevicesRiccardo Spolaor, Laila Abudahi, Veelasha Moonsamy et al.
More and more people are regularly using mobile and battery-powered handsets, such as smartphones and tablets. At the same time, thanks to the technological innovation and to the high user demands, those devices are integrating extensive functionalities and developers are writing battery-draining apps, which results in a surge of energy consumption of these devices. This scenario leads many people to often look for opportunities to charge their devices at public charging stations: the presence of such stations is already prominent around public areas such as hotels, shopping malls, airports, gyms and museums, and is expected to significantly grow in the future. While most of the time the power comes for free, there is no guarantee that the charging station is not maliciously controlled by an adversary, with the intention to exfiltrate data from the devices that are connected to it. In this paper, we illustrate for the first time how an adversary could leverage a maliciously controlled charging station to exfiltrate data from the smartphone via a USB charging cable (i.e., without using the data transfer functionality), controlling a simple app running on the device, and without requiring any permission to be granted by the user to send data out of the device. We show the feasibility of the proposed attack through a prototype implementation in Android, which is able to send out potentially sensitive information, such as IMEI, contacts' phone number, and pictures.