Kyle Domico

CR
h-index9
5papers
12citations
Novelty51%
AI Score44

5 Papers

CRMay 8Code
Longitudinal Analyses of SAST Tools: A CodeQL Case Study

Jean-Charles Noirot Ferrand, Kyle Domico, Yohan Beugin et al.

Open-source software (OSS) pipelines rely on automated static analysis tools to prevent the introduction of vulnerabilities in code. However, there is limited understanding of the efficacy of these tools across the OSS ecosystem over time. In this paper, we introduce a novel method to evaluate static application security testing (SAST) tools through longitudinal measurements and perform the largest academic study of CodeQL -- the most prevalent static analysis tool from GitHub -- on OSS codebases. We apply our apparatus on 114 versions of CodeQL over time on 3993 CVEs from 1622 repositories to measure key properties of the tool, culminating in more than 20 billion lines of code analyzed. First, we measure its effectiveness, i.e., its ability to detect vulnerabilities before they are fixed. Then, we determine whether these detections were actionable through two measures of the distance between findings and vulnerability location either over the entire codebase or within the vulnerable file. Finally, we study the stability of CodeQL by examining how vulnerability detections hold across versions and the evolution of CodeQL on the accuracy-precision trade-off. We find that CodeQL identifies a total of 171 CVEs, and that for 83 of them, a CodeQL version prior to the fix could detect it. Such detections are in general actionable if findings are triaged across files, as for 50% of the 171 detections, more than 50% of findings in the vulnerable file are located in the vulnerable location. Finally, we show that CVE detections are not monotonic across versions as 21 CVEs were no longer detected following a version change and 17 that were never redetected. Our study shows that using SAST tools is a matter of best practice as they prevent numerous vulnerabilities from being introduced, but that developers should be aware of changes that may leave blind spots in detections upon updates of the tool.

CRMay 14
The Role of Learning in Attacking ML-based Network Intrusion Detection

Kyle Domico, Jean-Charles Noirot Ferrand, Patrick McDaniel

Machine learning (ML)-based network intrusion detection is susceptible to attacks that perturb malicious network flows to evade detection. Existing approaches to evaluating the robustness of these models rely on gradient-based optimization that are computationally expensive and restricted to differentiable model architectures. This limits their practicality for continuous, large-scale evaluation. To address this, we develop lightweight adversarial agents trained via reinforcement learning (RL) that decouples the cost of learning an evasion strategy from the cost of executing it. These agents learn offline to perturb malicious NetFlow records to evade surrogate intrusion detection models, encoding the resulting strategy into a reusable policy that requires no gradient computation at deployment. We evaluate our approach on four NetFlow datasets spanning enterprise, cloud, and IoT environments against diverse model architectures, including non-differentiable classifiers that gradient-based methods cannot evaluate directly. Agents achieve up to 58.1% attack success at 0.31ms per attack demonstrating up to 1,042X improvement in throughput (attack success per ms) over gradient-based methods. On non-differentiable targets, gradient-based methods lose over 59% of their effectiveness to surrogate transfer, while the RL agent evaluates these models directly at 29.8% attack success. We further conduct a comprehensive transferability study on ML-based intrusion detection, evaluating agent generalization across unseen model architectures and traffic distributions. Our results establish lightweight RL agents as a practical and scalable tool for continuous ML robustness evaluation across diverse network intrusion detection environments.

LGApr 4, 2022
A Machine Learning and Computer Vision Approach to Geomagnetic Storm Forecasting

Kyle Domico, Ryan Sheatsley, Yohan Beugin et al.

Geomagnetic storms, disturbances of Earth's magnetosphere caused by masses of charged particles being emitted from the Sun, are an uncontrollable threat to modern technology. Notably, they have the potential to damage satellites and cause instability in power grids on Earth, among other disasters. They result from high sun activity, which are induced from cool areas on the Sun known as sunspots. Forecasting the storms to prevent disasters requires an understanding of how and when they will occur. However, current prediction methods at the National Oceanic and Atmospheric Administration (NOAA) are limited in that they depend on expensive solar wind spacecraft and a global-scale magnetometer sensor network. In this paper, we introduce a novel machine learning and computer vision approach to accurately forecast geomagnetic storms without the need of such costly physical measurements. Our approach extracts features from images of the Sun to establish correlations between sunspots and geomagnetic storm classification and is competitive with NOAA's predictions. Indeed, our prediction achieves a 76% storm classification accuracy. This paper serves as an existence proof that machine learning and computer vision techniques provide an effective means for augmenting and improving existing geomagnetic storm forecasting methods.

CROct 17, 2023
The Efficacy of Transformer-based Adversarial Attacks in Security Domains

Kunyang Li, Kyle Domico, Jean-Charles Noirot Ferrand et al.

Today, the security of many domains rely on the use of Machine Learning to detect threats, identify vulnerabilities, and safeguard systems from attacks. Recently, transformer architectures have improved the state-of-the-art performance on a wide range of tasks such as malware detection and network intrusion detection. But, before abandoning current approaches to transformers, it is crucial to understand their properties and implications on cybersecurity applications. In this paper, we evaluate the robustness of transformers to adversarial samples for system defenders (i.e., resiliency to adversarial perturbations generated on different types of architectures) and their adversarial strength for system attackers (i.e., transferability of adversarial samples generated by transformers to other target models). To that effect, we first fine-tune a set of pre-trained transformer, Convolutional Neural Network (CNN), and hybrid (an ensemble of transformer and CNN) models to solve different downstream image-based tasks. Then, we use an attack algorithm to craft 19,367 adversarial examples on each model for each task. The transferability of these adversarial examples is measured by evaluating each set on other models to determine which models offer more adversarial strength, and consequently, more robustness against these attacks. We find that the adversarial examples crafted on transformers offer the highest transferability rate (i.e., 25.7% higher than the average) onto other models. Similarly, adversarial examples crafted on other models have the lowest rate of transferability (i.e., 56.7% lower than the average) onto transformers. Our work emphasizes the importance of studying transformer architectures for attacking and defending models in security domains, and suggests using them as the primary architecture in transfer attack settings.

CRMar 3, 2025
Adversarial Agents: Black-Box Evasion Attacks with Reinforcement Learning

Kyle Domico, Jean-Charles Noirot Ferrand, Ryan Sheatsley et al.

Attacks on machine learning models have been extensively studied through stateless optimization. In this paper, we demonstrate how a reinforcement learning (RL) agent can learn a new class of attack algorithms that generate adversarial samples. Unlike traditional adversarial machine learning (AML) methods that craft adversarial samples independently, our RL-based approach retains and exploits past attack experience to improve the effectiveness and efficiency of future attacks. We formulate adversarial sample generation as a Markov Decision Process and evaluate RL's ability to (a) learn effective and efficient attack strategies and (b) compete with state-of-the-art AML. On two image classification benchmarks, our agent increases attack success rate by up to 13.2% and decreases the average number of victim model queries per attack by up to 16.9% from the start to the end of training. In a head-to-head comparison with state-of-the-art image attacks, our approach enables an adversary to generate adversarial samples with 17% more success on unseen inputs post-training. From a security perspective, this work demonstrates a powerful new attack vector that uses RL to train agents that attack ML models efficiently and at scale.