Wouter Joosen

CR
h-index29
11papers
1,359citations
Novelty44%
AI Score40

11 Papers

CRMar 6
A LINDDUN-based Privacy Threat Modeling Framework for GenAI

Qianying Liao, Jonah Bellemans, Laurens Sion et al.

As generative AI (GenAI) systems become increasingly prevalent across various technological stacks, the question of how such systems handle sensitive and personal data flows becomes increasingly important. Specifically, both the ability to harness and process large swaths of information as well as their stochastic nature raise key concerns related to both security and privacy. Unfortunately, while some of the traditional security threat modeling can effectively identify certain violations, privacy-related issues are often overlooked. To respond to these challenges, we introduce a novel domain-specific privacy threat modeling framework to support the privacy threat analysis of GenAI-based applications. This framework is constructed through a two-pronged approach: (1) a systematic review of the emerging literature on GenAI privacy threats, and (2) a case-driven application to a representative Chatbot system. These efforts yield a foundational GenAI privacy threat modeling framework built on LINDDUN. The new framework affects three out of the seven privacy threat types of LINDDUN and introduces 100 new GenAI examples to the knowledge base. Its effectiveness is validated on an AI Agent system, which demonstrates that a comprehensive privacy analysis can be supported by the new framework.

CRFeb 29, 2024
How to Train your Antivirus: RL-based Hardening through the Problem-Space

Ilias Tsingenopoulos, Jacopo Cortellazzi, Branislav Bošanský et al.

ML-based malware detection on dynamic analysis reports is vulnerable to both evasion and spurious correlations. In this work, we investigate a specific ML architecture employed in the pipeline of a widely-known commercial antivirus company, with the goal to harden it against adversarial malware. Adversarial training, the sole defensive technique that can confer empirical robustness, is not applicable out of the box in this domain, for the principal reason that gradient-based perturbations rarely map back to feasible problem-space programs. We introduce a novel Reinforcement Learning approach for constructing adversarial examples, a constituent part of adversarially training a model against evasion. Our approach comes with multiple advantages. It performs modifications that are feasible in the problem-space, and only those; thus it circumvents the inverse mapping problem. It also makes possible to provide theoretical guarantees on the robustness of the model against a particular set of adversarial capabilities. Our empirical exploration validates our theoretical insights, where we can consistently reach 0% Attack Success Rate after a few adversarial retraining iterations.

AIDec 20, 2023
The Adaptive Arms Race: Redefining Robustness in AI Security

Ilias Tsingenopoulos, Vera Rimmer, Davy Preuveneers et al.

Despite considerable efforts on making them robust, real-world AI-based systems remain vulnerable to decision based attacks, as definitive proofs of their operational robustness have so far proven intractable. Canonical robustness evaluation relies on adaptive attacks, which leverage complete knowledge of the defense and are tailored to bypass it. This work broadens the notion of adaptivity, which we employ to enhance both attacks and defenses, showing how they can benefit from mutual learning through interaction. We introduce a framework for adaptively optimizing black-box attacks and defenses under the competitive game they form. To assess robustness reliably, it is essential to evaluate against realistic and worst-case attacks. We thus enhance attacks and their evasive arsenal together using RL, apply the same principle to defenses, and evaluate them first independently and then jointly under a multi-agent perspective. We find that active defenses, those that dynamically control system responses, are an essential complement to model hardening against decision-based attacks; that these defenses can be circumvented by adaptive attacks, something that elicits defenses being adaptive too. Our findings, supported by an extensive theoretical and empirical investigation, confirm that adaptive adversaries pose a serious threat to black-box AI-based systems, rekindling the proverbial arms race. Notably, our approach outperforms the state-of-the-art black-box attacks and defenses, while bringing them together to render effective insights into the robustness of real-world deployed ML-based systems.

CRFeb 18, 2021
The CNAME of the Game: Large-scale Analysis of DNS-based Tracking Evasion

Yana Dimova, Gunes Acar, Lukasz Olejnik et al.

Online tracking is a whack-a-mole game between trackers who build and monetize behavioral user profiles through intrusive data collection, and anti-tracking mechanisms, deployed as a browser extension, built-in to the browser, or as a DNS resolver. As a response to pervasive and opaque online tracking, more and more users adopt anti-tracking tools to preserve their privacy. Consequently, as the information that trackers can gather on users is being curbed, some trackers are looking for ways to evade these tracking countermeasures. In this paper we report on a large-scale longitudinal evaluation of an anti-tracking evasion scheme that leverages CNAME records to include tracker resources in a same-site context, effectively bypassing anti-tracking measures that use fixed hostname-based block lists. Using historical HTTP Archive data we find that this tracking scheme is rapidly gaining traction, especially among high-traffic websites. Furthermore, we report on several privacy and security issues inherent to the technical setup of CNAME-based tracking that we detected through a combination of automated and manual analyses. We find that some trackers are using the technique against the Safari browser, which is known to include strict anti-tracking configurations. Our findings show that websites using CNAME trackers must take extra precautions to avoid leaking sensitive information to third parties.

CRJun 4, 2018
Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation

Victor Le Pochat, Tom Van Goethem, Samaneh Tajalizadehkhoob et al.

In order to evaluate the prevalence of security and privacy practices on a representative sample of the Web, researchers rely on website popularity rankings such as the Alexa list. While the validity and representativeness of these rankings are rarely questioned, our findings show the contrary: we show for four main rankings how their inherent properties (similarity, stability, representativeness, responsiveness and benignness) affect their composition and therefore potentially skew the conclusions made in studies. Moreover, we find that it is trivial for an adversary to manipulate the composition of these lists. We are the first to empirically validate that the ranks of domains in each of the lists are easily altered, in the case of Alexa through as little as a single HTTP request. This allows adversaries to manipulate rankings on a large scale and insert malicious domains into whitelists or bend the outcome of research studies to their will. To overcome the limitations of such rankings, we propose improvements to reduce the fluctuations in list composition and guarantee better defenses against manipulation. To allow the research community to work with reliable and reproducible rankings, we provide Tranco, an improved ranking that we offer through an online service available at https://tranco-list.eu.

CRFeb 20, 2018
Frictionless Authentication Systems: Emerging Trends, Research Challenges and Opportunities

Tim Van hamme, Vera Rimmer, Davy Preuveneers et al.

Authentication and authorization are critical security layers to protect a wide range of online systems, services and content. However, the increased prevalence of wearable and mobile devices, the expectations of a frictionless experience and the diverse user environments will challenge the way users are authenticated. Consumers demand secure and privacy-aware access from any device, whenever and wherever they are, without any obstacles. This paper reviews emerging trends and challenges with frictionless authentication systems and identifies opportunities for further research related to the enrollment of users, the usability of authentication schemes, as well as security and privacy trade-offs of mobile and wearable continuous authentication systems.

CRAug 22, 2017
Herding Vulnerable Cats: A Statistical Approach to Disentangle Joint Responsibility for Web Security in Shared Hosting

Samaneh Tajalizadehkhoob, Tom van Goethem, Maciej Korczyński et al.

Hosting providers play a key role in fighting web compromise, but their ability to prevent abuse is constrained by the security practices of their own customers. {\em Shared} hosting, offers a unique perspective since customers operate under restricted privileges and providers retain more control over configurations. We present the first empirical analysis of the distribution of web security features and software patching practices in shared hosting providers, the influence of providers on these security practices, and their impact on web compromise rates. We construct provider-level features on the global market for shared hosting -- containing 1,259 providers -- by gathering indicators from 442,684 domains. Exploratory factor analysis of 15 indicators identifies four main latent factors that capture security efforts: content security, webmaster security, web infrastructure security and web application security. We confirm, via a fixed-effect regression model, that providers exert significant influence over the latter two factors, which are both related to the software stack in their hosting environment. Finally, by means of GLM regression analysis of these factors on phishing and malware abuse, we show that the four security and software patching factors explain between 10\% and 19\% of the variance in abuse at providers, after controlling for size. For web-application security for instance, we found that when a provider moves from the bottom 10\% to the best-performing 10\%, it would experience 4 times fewer phishing incidents. We show that providers have influence over patch levels--even higher in the stack, where CMSes can run as client-side software--and that this influence is tied to a substantial reduction in abuse levels.

CRAug 21, 2017
Automated Website Fingerprinting through Deep Learning

Vera Rimmer, Davy Preuveneers, Marc Juarez et al.

Several studies have shown that the network traffic that is generated by a visit to a website over Tor reveals information specific to the website through the timing and sizes of network packets. By capturing traffic traces between users and their Tor entry guard, a network eavesdropper can leverage this meta-data to reveal which website Tor users are visiting. The success of such attacks heavily depends on the particular set of traffic features that are used to construct the fingerprint. Typically, these features are manually engineered and, as such, any change introduced to the Tor network can render these carefully constructed features ineffective. In this paper, we show that an adversary can automate the feature engineering process, and thus automatically deanonymize Tor traffic by applying our novel method based on deep learning. We collect a dataset comprised of more than three million network traces, which is the largest dataset of web traffic ever used for website fingerprinting, and find that the performance achieved by our deep learning approaches is comparable to known methods which include various research efforts spanning over multiple years. The obtained success rate exceeds 96% for a closed world of 100 websites and 94% for our biggest closed world of 900 classes. In our open world evaluation, the most performant deep learning model is 2% more accurate than the state-of-the-art attack. Furthermore, we show that the implicit features automatically learned by our approach are far more resilient to dynamic changes of web content over time. We conclude that the ability to automatically construct the most relevant traffic features and perform accurate traffic recognition makes our deep learning based approach an efficient, flexible and robust technique for website fingerprinting.

CRMay 22, 2014
On the effectiveness of virtualization-based security

Francesco Gadaleta, Raoul Strackx, Nick Nikiforakis et al.

Protecting commodity operating systems and applications against malware and targeted attacks has proven to be difficult. In recent years, virtualization has received attention from security researchers who utilize it to harden existing systems and provide strong security guarantees. This has lead to interesting use cases such as cloud computing where possibly sensitive data is processed on remote, third party systems. The migration and processing of data in remote servers, poses new technical and legal questions, such as which security measures should be taken to protect this data or how can it be proven that execution of code wasn't tampered with. In this paper we focus on technological aspects. We discuss the various possibilities of security within the virtualization layer and we use as a case study \HelloRootkitty{}, a lightweight invariance-enforcing framework which allows an operating system to recover from kernel-level attacks. In addition to \HelloRootkitty{}, we also explore the use of special hardware chips as a way of further protecting and guaranteeing the integrity of a virtualized system.

OSMay 22, 2014
Hello rootKitty: A lightweight invariance-enforcing framework

Francesco Gadaleta, Nick Nikiforakis, Yves Younan et al.

In monolithic operating systems, the kernel is the piece of code that executes with the highest privileges and has control over all the software running on a host. A successful attack against an operating system's kernel means a total and complete compromise of the running system. These attacks usually end with the installation of a rootkit, a stealthy piece of software running with kernel privileges. When a rootkit is present, no guarantees can be made about the correctness, privacy or isolation of the operating system. In this paper we present \emph{Hello rootKitty}, an invariance-enforcing framework which takes advantage of current virtualization technology to protect a guest operating system against rootkits. \emph{Hello rootKitty} uses the idea of invariance to detect maliciously modified kernel data structures and restore them to their original legitimate values. Our prototype has negligible performance and memory overhead while effectively protecting commodity operating systems from modern rootkits.

SEMay 22, 2014
HyperForce: Hypervisor-enForced Execution of Security-Critical Code

Francesco Gadaleta, Nick Nikiforakis, Jan Tobias Muhlberg et al.

The sustained popularity of the cloud and cloud-related services accelerate the evolution of virtualization-enabling technologies. Modern off-the-shelf computers are already equipped with specialized hardware that enables a hypervisor to manage the simultaneous execution of multiple operating systems. Researchers have proposed security mechanisms that operate within such a hypervisor to protect the \textit{virtualized} operating systems from attacks. These mechanisms improve in security over previous techniques since the defense system is no longer part of an operating system's attack surface. However, due to constant transitions between the hypervisor and the operating systems, these countermeasures typically incur a significant performance overhead. In this paper we present HyperForce, a framework which allows the deployment of security-critical code in a way that significantly outperforms previous \textit{in-hypervisor} systems while maintaining similar guarantees with respect to security and integrity. HyperForce is a hybrid system which combines the performance of an \textit{in-guest} security mechanism with the security of in-hypervisor one. We evaluate our framework by using it to re-implement an invariance-based rootkit detection system and show the performance benefits of a HyperForce-utilizing countermeasure.