CRApr 20, 2022
Backdooring Explainable Machine LearningMaximilian Noppel, Lukas Peter, Christian Wressnegger
Explainable machine learning holds great potential for analyzing and understanding learning-based systems. These methods can, however, be manipulated to present unfaithful explanations, giving rise to powerful and stealthy adversaries. In this paper, we demonstrate blinding attacks that can fully disguise an ongoing attack against the machine learning model. Similar to neural backdoors, we modify the model's prediction upon trigger presence but simultaneously also fool the provided explanation. This enables an adversary to hide the presence of the trigger or point the explanation to entirely different portions of the input, throwing a red herring. We analyze different manifestations of such attacks for different explanation types in the image domain, before we resume to conduct a red-herring attack against malware classification.
CRApr 25, 2020Code
Aim Low, Shoot High: Evading Aimbot Detectors by Mimicking User BehaviorTim Witschel, Christian Wressnegger
Current schemes to detect cheating in online games often build on the assumption that the applied cheat takes actions that are drastically different from normal behavior. For instance, an Aimbot for a first-person shooter is used by an amateur player to increase his/her capabilities many times over. Attempts to evade detection would require to reduce the intended effect such that the advantage is presumably lowered into insignificance. We argue that this is not necessarily the case and demonstrate how a professional player is able to make use of an adaptive Aimbot that mimics user behavior to gradually increase performance and thus evades state-of-the-art detection mechanisms. We show this in a quantitative and qualitative evaluation with two professional "Counter-Strike: Global Offensive" players, two open-source Anti-Cheat systems, and the commercially established combination of VAC, VACnet, and Overwatch.
LGNov 25, 2018Code
Poisoning Behavioral Malware ClusteringBattista Biggio, Konrad Rieck, Davide Ariu et al.
Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems. However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly compromised if an attacker can exercise some control over the input data. In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker may be able to subvert these approaches through a careful injection of samples with poisoning behavior. To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments not only demonstrate that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can only inject a very small percentage of attacks into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.
LGDec 19, 2024
Holistic Adversarially Robust PruningQi Zhao, Christian Wressnegger
Neural networks can be drastically shrunk in size by removing redundant parameters. While crucial for the deployment on resource-constraint hardware, oftentimes, compression comes with a severe drop in accuracy and lack of adversarial robustness. Despite recent advances, counteracting both aspects has only succeeded for moderate compression rates so far. We propose a novel method, HARP, that copes with aggressive pruning significantly better than prior work. For this, we consider the network holistically. We learn a global compression strategy that optimizes how many parameters (compression rate) and which parameters (scoring connections) to prune specific to each layer individually. Our method fine-tunes an existing model with dynamic regularization, that follows a step-wise incremental function balancing the different objectives. It starts by favoring robustness before shifting focus on reaching the target compression rate and only then handles the objectives equally. The learned compression strategies allow us to maintain the pre-trained model natural accuracy and its adversarial robustness for a reduction by 99% of the network original size. Moreover, we observe a crucial influence of non-uniform compression across layers.
CVMar 11, 2025
Controlling Latent Diffusion Using Latent CLIPJason Becker, Chris Wendler, Peter Baylies et al.
Instead of performing text-conditioned denoising in the image domain, latent diffusion models (LDMs) operate in latent space of a variational autoencoder (VAE), enabling more efficient processing at reduced computational costs. However, while the diffusion process has moved to the latent space, the contrastive language-image pre-training (CLIP) models, as used in many image processing tasks, still operate in pixel space. Doing so requires costly VAE-decoding of latent images before they can be processed. In this paper, we introduce Latent-CLIP, a CLIP model that operates directly in the latent space. We train Latent-CLIP on 2.7B pairs of latent images and descriptive texts, and show that it matches zero-shot classification performance of similarly sized CLIP models on both the ImageNet benchmark and a LDM-generated version of it, demonstrating its effectiveness in assessing both real and generated content. Furthermore, we construct Latent-CLIP rewards for reward-based noise optimization (ReNO) and show that they match the performance of their CLIP counterparts on GenEval and T2I-CompBench while cutting the cost of the total pipeline by 21%. Finally, we use Latent-CLIP to guide generation away from harmful content, achieving strong performance on the inappropriate image prompts (I2P) benchmark and a custom evaluation, without ever requiring the costly step of decoding intermediate images.
CVOct 21, 2025
S2AP: Score-space Sharpness Minimization for Adversarial PruningGiorgio Piras, Qi Zhao, Fabio Brau et al.
Adversarial pruning methods have emerged as a powerful tool for compressing neural networks while preserving robustness against adversarial attacks. These methods typically follow a three-step pipeline: (i) pretrain a robust model, (ii) select a binary mask for weight pruning, and (iii) finetune the pruned model. To select the binary mask, these methods minimize a robust loss by assigning an importance score to each weight, and then keep the weights with the highest scores. However, this score-space optimization can lead to sharp local minima in the robust loss landscape and, in turn, to an unstable mask selection, reducing the robustness of adversarial pruning methods. To overcome this issue, we propose a novel plug-in method for adversarial pruning, termed Score-space Sharpness-aware Adversarial Pruning (S2AP). Through our method, we introduce the concept of score-space sharpness minimization, which operates during the mask search by perturbing importance scores and minimizing the corresponding robust loss. Extensive experiments across various datasets, models, and sparsity levels demonstrate that S2AP effectively minimizes sharpness in score space, stabilizing the mask selection, and ultimately improving the robustness of adversarial pruning methods.
LGAug 26, 2021
Machine Unlearning of Features and LabelsAlexander Warnecke, Lukas Pirch, Christian Wressnegger et al.
Removing information from a machine learning model is a non-trivial task that requires to partially revert the training process. This task is unavoidable when sensitive data, such as credit card numbers or passwords, accidentally enter the model and need to be removed afterwards. Recently, different concepts for machine unlearning have been proposed to address this problem. While these approaches are effective in removing individual data points, they do not scale to scenarios where larger groups of features and labels need to be reverted. In this paper, we propose the first method for unlearning features and labels. Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters. It enables to adapt the influence of training data on a learning model retrospectively, thereby correcting data leaks and privacy issues. For learning models with strongly convex loss functions, our method provides certified unlearning with theoretical guarantees. For models with non-convex losses, we empirically show that unlearning features and labels is effective and significantly faster than other strategies.
CRJun 8, 2021
LaserShark: Establishing Fast, Bidirectional Communication into Air-Gapped SystemsNiclas Kühnapfel, Stefan Preußler, Maximilian Noppel et al.
Physical isolation, so called air-gapping, is an effective method for protecting security-critical computers and networks. While it might be possible to introduce malicious code through the supply chain, insider attacks, or social engineering, communicating with the outside world is prevented. Different approaches to breach this essential line of defense have been developed based on electromagnetic, acoustic, and optical communication channels. However, all of these approaches are limited in either data rate or distance, and frequently offer only exfiltration of data. We present a novel approach to infiltrate data to and exfiltrate data from air-gapped systems without any additional hardware on-site. By aiming lasers at already built-in LEDs and recording their response, we are the first to enable a long-distance (25m), bidirectional, and fast (18.2kbps in & 100kbps out) covert communication channel. The approach can be used against any office device that operates LEDs at the CPU's GPIO interface.
CROct 19, 2020
Dos and Don'ts of Machine Learning in Computer SecurityDaniel Arp, Erwin Quiring, Feargus Pendlebury et al.
With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment. In this paper, we look at this problem with critical eyes. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these pitfalls are widespread in the current security literature. In an empirical analysis, we further demonstrate how individual pitfalls can lead to unrealistic performance and interpretations, obstructing the understanding of the security problem at hand. As a remedy, we propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible. Furthermore, we identify open problems when applying machine learning in security and provide directions for further research.
CRDec 9, 2019
Political Elections Under (Social) Fire? Analysis and Detection of Propaganda on TwitterAnsgar Kellner, Lisa Rangosch, Christian Wressnegger et al.
For many, social networks have become the primary source of news, although the correctness of the provided information and its trustworthiness are often unclear. The investigations of the 2016 US presidential elections have brought the existence of external campaigns to light aiming at affecting the general political public opinion. In this paper, we investigate whether a similar influence on political elections can be observed in Europe as well. To this end, we use the past German federal election as an indicator and inspect the propaganda on Twitter, based on data from a period of 268 days. We find that 79 trolls from the US campaign have also acted upon the German federal election spreading right-wing views. Moreover, we develop a detector for finding automated behavior that enables us to identify 2,414 previously unknown bots.
LGJun 5, 2019
Evaluating Explanation Methods for Deep Learning in SecurityAlexander Warnecke, Daniel Arp, Christian Wressnegger et al.
Deep learning is increasingly used as a building block of security systems. Unfortunately, neural networks are hard to interpret and typically opaque to the practitioner. The machine learning community has started to address this problem by developing methods for explaining the predictions of neural networks. While several of these approaches have been successfully applied in the area of computer vision, their application in security has received little attention so far. It is an open question which explanation methods are appropriate for computer security and what requirements they need to satisfy. In this paper, we introduce criteria for comparing and evaluating explanation methods in the context of computer security. These cover general properties, such as the accuracy of explanations, as well as security-focused aspects, such as the completeness, efficiency, and robustness. Based on our criteria, we investigate six popular explanation methods and assess their utility in security systems for malware detection and vulnerability discovery. We observe significant differences between the methods and build on these to derive general recommendations for selecting and applying explanation methods in computer security.
CRAug 28, 2018
Web-based Cryptojacking in the WildMarius Musch, Christian Wressnegger, Martin Johns et al.
With the introduction of memory-bound cryptocurrencies, such as Monero, the implementation of mining code in browser-based JavaScript has become a worthwhile alternative to dedicated mining rigs. Based on this technology, a new form of parasitic computing, widely called cryptojacking or drive-by mining, has gained momentum in the web. A cryptojacking site abuses the computing resources of its visitors to covertly mine for cryptocurrencies. In this paper, we systematically explore this phenomenon. For this, we propose a 3-phase analysis approach, which enables us to identify mining scripts and conduct a large-scale study on the prevalence of cryptojacking in the Alexa 1 million websites. We find that cryptojacking is common, with currently 1 out of 500 sites hosting a mining script. Moreover, we perform several secondary analyses to gain insight into the cryptojacking landscape, including a measurement of code characteristics, an estimate of expected mining revenue, and an evaluation of current blacklist-based countermeasures.
CROct 19, 2016
From Malware Signatures to Anti-Virus Assisted AttacksChristian Wressnegger, Kevin Freeman, Fabian Yamaguchi et al.
Although anti-virus software has significantly evolved over the last decade, classic signature matching based on byte patterns is still a prevalent concept for identifying security threats. Anti-virus signatures are a simple and fast detection mechanism that can complement more sophisticated analysis strategies. However, if signatures are not designed with care, they can turn from a defensive mechanism into an instrument of attack. In this paper, we present a novel method for automatically deriving signatures from anti-virus software and demonstrate how the extracted signatures can be used to attack sensible data with the aid of the virus scanner itself. We study the practicability of our approach using four commercial products and exemplarily discuss a novel attack vector made possible by insufficiently designed signatures. Our research indicates that there is an urgent need to improve pattern-based signatures if used in anti-virus software and to pursue alternative detection approaches in such products.