45.9CRApr 7
SoK: Honeypots & LLMs, More Than the Sum of Their Parts?Robert A. Bridges, Thomas R. Mitchell, Mauricio Muñoz et al.
The advent of Large Language Models (LLMs) promised to resolve the long-standing paradox in honeypot design: achieving high-fidelity deception with low operational risk. Since late 2022, a flurry of research has demonstrated steady progress from ideation to prototype implementation. While promising, evaluations show only incremental progress in real-world deployments, and the field still lacks a cohesive understanding of emerging architectural patterns, core challenges, and evaluation paradigms. To fill this gap, we provide the first comprehensive overview and analysis of this new domain, focusing on three critical, intersecting research areas: we provide a taxonomy of honeypot detection vectors, mapped to how LLM-based simulation can or cannot aid deception; we synthesize the emerging literature on LLM-powered honeypots, identifying a canonical architecture, an evaluation tetrad, and an attacker trichotomy mapped to honeypot requirements; and we chart the evolution of honeypot log analysis into automated intelligence generation. Finally, we synthesize these findings into a forward-looking research roadmap, arguing that the true potential of this technology lies in creating autonomous, self-improving deception systems to counter the emerging threat of intelligent, automated attackers.
MLNov 15, 2023
Are Normalizing Flows the Key to Unlocking the Exponential Mechanism?Robert A. Bridges, Vandy J. Tombs, Christopher B. Stanley
The Exponential Mechanism (ExpM), designed for private optimization, has been historically sidelined from use on continuous sample spaces, as it requires sampling from a generally intractable density, and, to a lesser extent, bounding the sensitivity of the objective function. Any differential privacy (DP) mechanism can be instantiated as ExpM, and ExpM poses an elegant solution for private machine learning (ML) that bypasses inherent inefficiencies of DPSGD. This paper seeks to operationalize ExpM for private optimization and ML by using an auxiliary Normalizing Flow (NF), an expressive deep network for density learning, to approximately sample from ExpM density. The method, ExpM+NF is an alternative to SGD methods for model training. We prove a sensitivity bound for the $\ell^2$ loss permitting ExpM use with any sampling method. To test feasibility, we present results on MIMIC-III health data comparing (non-private) SGD, DPSGD, and ExpM+NF training methods' accuracy and training time. We find that a model sampled from ExpM+NF is nearly as accurate as non-private SGD, more accurate than DPSGD, and ExpM+NF trains faster than Opacus' DPSGD implementation. Unable to provide a privacy proof for the NF approximation, we present empirical results to investigate privacy including the LiRA membership inference attack of Carlini et al. and the recent privacy auditing lower bound method of Steinke et al. Our findings suggest ExpM+NF provides more privacy than non-private SGD, but not as much as DPSGD, although many attacks are impotent against any model. Ancillary benefits of this work include pushing the SOTA of privacy and accuracy on MIMIC-III healthcare data, exhibiting the use of ExpM+NF for Bayesian inference, showing the limitations of empirical privacy auditing in practice, and providing several privacy theorems applicable to distribution learning.
CRJun 19, 2024
Evaluating lightweight unsupervised online IDS for masquerade attacks in CANPablo Moriano, Steven C. Hespeler, Mingyan Li et al.
Vehicular controller area networks (CANs) are susceptible to masquerade attacks by malicious adversaries. In masquerade attacks, adversaries silence a targeted ID and then send malicious frames with forged content at the expected timing of benign frames. As masquerade attacks could seriously harm vehicle functionality and are the stealthiest attacks to detect in CAN, recent work has devoted attention to compare frameworks for detecting masquerade attacks in CAN. However, most existing works report offline evaluations using CAN logs already collected using simulations that do not comply with the domain's real-time constraints. Here we contribute to advance the state of the art by presenting a comparative evaluation of four different non-deep learning (DL)-based unsupervised online intrusion detection systems (IDS) for masquerade attacks in CAN. Our approach differs from existing comparative evaluations in that we analyze the effect of controlling streaming data conditions in a sliding window setting. In doing so, we use realistic masquerade attacks being replayed from the ROAD dataset. We show that although evaluated IDS are not effective at detecting every attack type, the method that relies on detecting changes in the hierarchical structure of clusters of time series produces the best results at the expense of higher computational overhead. We discuss limitations, open challenges, and how the evaluated methods can be used for practical unsupervised online CAN IDS for masquerade attacks.
CRJan 20, 2022
Assembling a Cyber Range to Evaluate Artificial Intelligence / Machine Learning (AI/ML) Security ToolsJeffrey A. Nichols, Kevin D. Spakes, Cory L. Watson et al.
In this case study, we describe the design and assembly of a cyber security testbed at Oak Ridge National Laboratory in Oak Ridge, TN, USA. The range is designed to provide agile reconfigurations to facilitate a wide variety of experiments for evaluations of cyber security tools -- particularly those involving AI/ML. In particular, the testbed provides realistic test environments while permitting control and programmatic observations/data collection during the experiments. We have designed in the ability to repeat the evaluations, so additional tools can be evaluated and compared at a later time. The system is one that can be scaled up or down for experiment sizes. At the time of the conference we will have completed two full-scale, national, government challenges on this range. These challenges are evaluating the performance and operating costs for AI/ML-based cyber security tools for application into large, government-sized networks. These evaluations will be described as examples providing motivation and context for various design decisions and adaptations we have made. The first challenge measured end-point security tools against 100K file samples (benignware and malware) chosen across a range of file types. The second is an evaluation of network intrusion detection systems efficacy in identifying multi-step adversarial campaigns -- involving reconnaissance, penetration and exploitations, lateral movement, etc. -- with varying levels of covertness in a high-volume business network. The scale of each of these challenges requires automation systems to repeat, or simultaneously mirror identical the experiments for each ML tool under test. Providing an array of easy-to-difficult malicious activity for sussing out the true abilities of the AI/ML tools has been a particularly interesting and challenging aspect of designing and executing these challenge events.
CRJan 7, 2022
Detecting CAN Masquerade Attacks with Signal Clustering SimilarityPablo Moriano, Robert A. Bridges, Michael D. Iannacone
Vehicular Controller Area Networks (CANs) are susceptible to cyber attacks of different levels of sophistication. Fabrication attacks are the easiest to administer -- an adversary simply sends (extra) frames on a CAN -- but also the easiest to detect because they disrupt frame frequency. To overcome time-based detection methods, adversaries must administer masquerade attacks by sending frames in lieu of (and therefore at the expected time of) benign frames but with malicious payloads. Research efforts have proven that CAN attacks, and masquerade attacks in particular, can affect vehicle functionality. Examples include causing unintended acceleration, deactivation of vehicle's brakes, as well as steering the vehicle. We hypothesize that masquerade attacks modify the nuanced correlations of CAN signal time series and how they cluster together. Therefore, changes in cluster assignments should indicate anomalous behavior. We confirm this hypothesis by leveraging our previously developed capability for reverse engineering CAN signals (i.e., CAN-D [Controller Area Network Decoder]) and focus on advancing the state of the art for detecting masquerade attacks by analyzing time series extracted from raw CAN frames. Specifically, we demonstrate that masquerade attacks can be detected by computing time series clustering similarity using hierarchical clustering on the vehicle's CAN signals (time series) and comparing the clustering similarity across CAN captures with and without attacks. We test our approach in a previously collected CAN dataset with masquerade attacks (i.e., the ROAD dataset) and develop a forensic tool as a proof of concept to demonstrate the potential of the proposed approach for detecting CAN masquerade attacks.
CRMay 13, 2021
What Clinical Trials Can Teach Us about the Development of More Resilient AI for CybersecurityEdmon Begoli, Robert A. Bridges, Sean Oesch et al.
Policy-mandated, rigorously administered scientific testing is needed to provide transparency into the efficacy of artificial intelligence-based (AI-based) cyber defense tools for consumers and to prioritize future research and development. In this article, we propose a model that is informed by our experience, urged forward by massive scale cyberattacks, and inspired by parallel developments in the biomedical field and the unprecedentedly fast development of new vaccines to combat global pathogens.
CRJan 14, 2021
Time-Based CAN Intrusion Detection BenchmarkDeborah H. Blevins, Pablo Moriano, Robert A. Bridges et al.
Modern vehicles are complex cyber-physical systems made of hundreds of electronic control units (ECUs) that communicate over controller area networks (CANs). This inherited complexity has expanded the CAN attack surface which is vulnerable to message injection attacks. These injections change the overall timing characteristics of messages on the bus, and thus, to detect these malicious messages, time-based intrusion detection systems (IDSs) have been proposed. However, time-based IDSs are usually trained and tested on low-fidelity datasets with unrealistic, labeled attacks. This makes difficult the task of evaluating, comparing, and validating IDSs. Here we detail and benchmark four time-based IDSs against the newly published ROAD dataset, the first open CAN IDS dataset with real (non-simulated) stealthy attacks with physically verified effects. We found that methods that perform hypothesis testing by explicitly estimating message timing distributions have lower performance than methods that seek anomalies in a distribution-related statistic. In particular, these "distribution-agnostic" based methods outperform "distribution-based" methods by at least 55% in area under the precision-recall curve (AUC-PR). Our results expand the body of knowledge of CAN time-based IDSs by providing details of these methods and reporting their results when tested on datasets with real advanced attacks. Finally, we develop an after-market plug-in detector using lightweight hardware, which can be used to deploy the best performing IDS method on nearly any vehicle.
CRDec 29, 2020
A Comprehensive Guide to CAN IDS Data & Introduction of the ROAD DatasetMiki E. Verma, Robert A. Bridges, Michael D. Iannacone et al.
Although ubiquitous in modern vehicles, Controller Area Networks (CANs) lack basic security properties and are easily exploitable. A rapidly growing field of CAN security research has emerged that seeks to detect intrusions on CANs. Producing vehicular CAN data with a variety of intrusions is out of reach for most researchers as it requires expensive assets and expertise. To assist researchers, we present the first comprehensive guide to the existing open CAN intrusion datasets, including a quality analysis of each dataset and an enumeration of each's benefits, drawbacks, and suggested use case. Current public CAN IDS datasets are limited to real fabrication (simple message injection) attacks and simulated attacks often in synthetic data, which lack fidelity. In general, the physical effects of attacks on the vehicle are not verified in the available datasets. Only one dataset provides signal-translated data but not a corresponding raw binary version. Overall, the available data pigeon-holes CAN IDS works into testing on limited, often inappropriate data (usually with attacks that are too easily detectable to truly test the method), and this lack data has stymied comparability and reproducibility of results. As our primary contribution, we present the ROAD (Real ORNL Automotive Dynamometer) CAN Intrusion Dataset, consisting of over 3.5 hours of one vehicle's CAN data. ROAD contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real fuzzing, fabrication, and unique advanced attacks, as well as simulated masquerade attacks. To facilitate benchmarking CAN IDS methods that require signal-translated inputs, we also provide the signal time series format for many of the CAN captures. Our contributions aim to facilitate appropriate benchmarking and needed comparability in the CAN IDS field.
CRDec 16, 2020
Beyond the Hype: A Real-World Evaluation of the Impact and Cost of Machine Learning-Based Malware DetectionRobert A. Bridges, Sean Oesch, Miki E. Verma et al.
In this paper, we present a scientific evaluation of four prominent malware detection tools to assist an organization with two primary questions: To what extent do ML-based tools accurately classify previously- and never-before-seen files? Is it worth purchasing a network-level malware detector? To identify weaknesses, we tested each tool against 3,536 total files (2,554 or 72\% malicious, 982 or 28\% benign) of a variety of file types, including hundreds of malicious zero-days, polyglots, and APT-style files, delivered on multiple protocols. We present statistical results on detection time and accuracy, consider complementary analysis (using multiple tools together), and provide two novel applications of the recent cost-benefit evaluation procedure of Iannacone \& Bridges. While the ML-based tools are more effective at detecting zero-day files and executables, the signature-based tool may still be an overall better option. Both network-based tools provide substantial (simulated) savings when paired with either host tool, yet both show poor detection rates on protocols other than HTTP or SMTP. Our results show that all four tools have near-perfect precision but alarmingly low recall, especially on file types other than executables and office files -- 37% of malware tested, including all polyglot files, were undetected. Priorities for researchers and takeaways for end users are given.
CROct 15, 2019
Automated Ransomware Behavior Analysis: Pattern Extraction and Early DetectionQian Chen, Sheikh Rabiul Islam, Henry Haswell et al.
Security operation centers (SOCs) typically use a variety of tools to collect large volumes of host logs for detection and forensic of intrusions. Our experience, supported by recent user studies on SOC operators, indicates that operators spend ample time (e.g., hundreds of man-hours) on investigations into logs seeking adversarial actions. Similarly, reconfiguration of tools to adapt detectors for future similar attacks is commonplace upon gaining novel insights (e.g., through internal investigation or shared indicators). This paper presents an automated malware pattern-extraction and early detection tool, testing three machine learning approaches: TF-IDF (term frequency-inverse document frequency), Fisher's LDA (linear discriminant analysis) and ET (extra trees/extremely randomized trees) that can (1) analyze freshly discovered malware samples in sandboxes and generate dynamic analysis reports (host logs); (2) automatically extract the sequence of events induced by malware given a large volume of ambient (un-attacked) host logs, and the relatively few logs from hosts that are infected with potentially polymorphic malware; (3) rank the most discriminating features (unique patterns) of malware and from the learned behavior detect malicious activity; and (4) allows operators to visualize the discriminating features and their correlations to facilitate malware forensic efforts. To validate the accuracy and efficiency of our tool, we design three experiments and test seven ransomware attacks (i.e., WannaCry, DBGer, Cerber, Defray, GandCrab, Locky, and nRansom). The experimental results show that TF-IDF is the best of the three methods to identify discriminating features, and ET is the most time-efficient and robust approach.
MLApr 30, 2019
Active Manifolds: A non-linear analogue to Active SubspacesRobert A. Bridges, Anthony D. Gruber, Christopher Felder et al.
We present an approach to analyze $C^1(\mathbb{R}^m)$ functions that addresses limitations present in the Active Subspaces (AS) method of Constantine et al.(2015; 2014). Under appropriate hypotheses, our Active Manifolds (AM) method identifies a 1-D curve in the domain (the active manifold) on which nearly all values of the unknown function are attained, and which can be exploited for approximation or analysis, especially when $m$ is large (high-dimensional input space). We provide theorems justifying our AM technique and an algorithm permitting functional approximation and sensitivity analysis. Using accessible, low-dimensional functions as initial examples, we show AM reduces approximation error by an order of magnitude compared to AS, at the expense of more computation. Following this, we revisit the sensitivity analysis by Glaws et al. (2017), who apply AS to analyze a magnetohydrodynamic power generator model, and compare the performance of AM on the same data. Our analysis provides detailed information not captured by AS, exhibiting the influence of each parameter individually along an active manifold. Overall, AM represents a novel technique for analyzing functional models with benefits including: reducing $m$-dimensional analysis to a 1-D analogue, permitting more accurate regression than AS (at more computational expense), enabling more informative sensitivity analysis, and granting accessible visualizations(2-D plots) of parameter sensitivity along the AM.
CRJan 31, 2019
Quantifiable & Comparable Evaluations of Cyber Defensive Capabilities: A Survey & Novel, Unified ApproachMichael D. Iannacone, Robert A. Bridges
Metrics and frameworks to quantifiably assess security measures have arisen from needs of three distinct research communities - statistical measures from the intrusion detection and prevention literature, evaluation of cyber exercises, e.g.,red-team and capture-the-flag competitions, and economic analyses addressing cost-versus-security tradeoffs. In this paper we provide two primary contributions to the security evaluation literature - a representative survey, and a novel framework for evaluating security that is flexible, applicable to all three use cases, and readily interpretable. In our survey of the literature we identify the distinct themes from each community's evaluation procedures side by side and flesh out the drawbacks and benefits of each. The evaluation framework we propose includes comprehensively modeling the resource, labor, and attack costs in dollars incurred based on expected resource usage, accuracy metrics, and time. This framework provides a unified approach in that it incorporates the accuracy and performance metrics, which dominate intrusion detection evaluation, the time to detection and impact to data and resources of an attack, favored by educational competitions' metrics, and the monetary cost of many essential security components used in financial analysis. Moreover, it is flexible enough to accommodate each use case, easily interpretable and comparable, and comprehensive in terms of costs considered.Finally, we provide two examples of the framework applied to real-world use cases. Overall, we provide a survey and a grounded, flexible framework with multiple concrete examples for evaluating security which can address the needs of three currently distinct communities.
CRDec 30, 2018
Towards a CAN IDS based on a neural-network data field predictorKrzysztof Pawelec, Robert A. Bridges, Frank L. Combs
Modern vehicles contain a few controller area networks (CANs), which allow scores of on-board electronic control units (ECUs) to communicate messages critical to vehicle functions and driver safety. CAN provide a lightweight and reliable broadcast protocol but is bereft of security features. As evidenced by many recent research works, CAN exploits are possible both remotely and with direct access, fueling a growing CAN intrusion detection system (IDS) body of research. A challenge for pioneering vehicle-agnostic IDSs is that passenger vehicles' CAN message encodings are proprietary, defined and held secret by original equipment manufacturers (OEMs). Targeting detection of next-generation attacks, in which messages are sent from the expected ECU at the expected time but with malicious content, researchers are now seeking to leverage "CAN data models", which predict future CAN message contents and use prediction error to identify anomalous, hopefully malicious CAN messages. Yet, current works model CAN signals post-translation, i.e., after applying OEM-donated or reverse-engineered translations from raw data. In this work, we present initial IDS results testing deep neural networks used to predict CAN data at the bit level, thereby providing IDS capabilities but avoiding reverse engineering proprietary encodings. Our results suggest the method is promising for continuous signals in CAN data, but struggles for discrete, e.g., binary, signals.
HCDec 7, 2018
How do information security workers use host data? A summary of interviews with security analystsRobert A. Bridges, Michael D. Iannacone, John R. Goodall et al.
Modern security operations centers (SOCs) employ a variety of tools for intrusion detection, prevention, and widespread log aggregation and analysis. While research efforts are quickly proposing novel algorithms and technologies for cyber security, access to actual security personnel, their data, and their problems are necessarily limited by security concerns and time constraints. To help bridge the gap between researchers and security centers, this paper reports results of semi-structured interviews of 13 professionals from five different SOCs including at least one large academic, research, and government organization. The interviews focused on the current practices and future desires of SOC operators about host-based data collection capabilities, what is learned from the data, what tools are used, and how tools are evaluated. Questions and the responses are organized and reported by topic. Then broader themes are discussed. Forest-level takeaways from the interviews center on problems stemming from size of data, correlation of heterogeneous but related data sources, signal-to-noise ratio of data, and analysts' time.
APNov 1, 2018
Defining a Metric Space of Host Logs and Operational Use CasesMiki E. Verma, Robert A. Bridges
Host logs, in particular, Windows Event Logs, are a valuable source of information often collected by security operation centers (SOCs). The semi-structured nature of host logs inhibits automated analytics, and while manual analysis is common, the sheer volume makes manual inspection of all logs impossible. Although many powerful algorithms for analyzing time-series and sequential data exist, utilization of such algorithms for most cyber security applications is either infeasible or requires tailored, research-intensive preparations. In particular, basic mathematic and algorithmic developments for providing a generalized, meaningful similarity metric on system logs is needed to bridge the gap between many existing sequential data mining methods and this currently available but under-utilized data source. In this paper, we provide a rigorous definition of a metric product space on Windows Event Logs, providing an embedding that allows for the application of established machine learning and time-series analysis methods. We then demonstrate the utility and flexibility of this embedding with multiple use-cases on real data: (1) comparing known infected to new host log streams for attack detection and forensics, (2) collapsing similar streams of logs into semantically-meaningful groups (by user, by role), thereby reducing the quantity of data but not the content, (3) clustering logs as well as short sequences of logs to identify and visualize user behaviors and background processes over time. Overall, we provide a metric space framework for general host logs and log sequences that respects semantic similarity and facilitates a wide variety of data science analytics to these logs without data-specific preparations for each.
CRAug 28, 2018
Exploiting the Shape of CAN Data for In-Vehicle Intrusion DetectionZachariah Tyree, Robert A. Bridges, Frank L. Combs et al.
Modern vehicles rely on scores of electronic control units (ECUs) broadcasting messages over a few controller area networks (CANs). Bereft of security features, in-vehicle CANs are exposed to cyber manipulation and multiple researches have proved viable, life-threatening cyber attacks. Complicating the issue, CAN messages lack a common mapping of functions to commands, so packets are observable but not easily decipherable. We present a transformational approach to CAN IDS that exploits the geometric properties of CAN data to inform two novel detectors--one based on distance from a learned, lower dimensional manifold and the other on discontinuities of the manifold over time. Proof-of-concept tests are presented by implementing a potential attack approach on a driving vehicle. The initial results suggest that (1) the first detector requires additional refinement but does hold promise; (2) the second detector gives a clear, strong indicator of the attack; and (3) the algorithms keep pace with high-speed CAN messages. As our approach is data-driven it provides a vehicle-agnostic IDS that eliminates the need to reverse engineer CAN messages and can be ported to an after-market plugin.
CRMay 24, 2018
Forming IDEAS Interactive Data Exploration & Analysis SystemRobert A. Bridges, Maria A. Vincent, Kelly M. T. Huffer et al.
Modern cyber security operations collect an enormous amount of logging and alerting data. While analysts have the ability to query and compute simple statistics and plots from their data, current analytical tools are too simple to admit deep understanding. To detect advanced and novel attacks, analysts turn to manual investigations. While commonplace, current investigations are time-consuming, intuition-based, and proving insufficient. Our hypothesis is that arming the analyst with easy-to-use data science tools will increase their work efficiency, provide them with the ability to resolve hypotheses with scientific inquiry of their data, and support their decisions with evidence over intuition. To this end, we present our work to build IDEAS (Interactive Data Exploration and Analysis System). We present three real-world use-cases that drive the system design from the algorithmic capabilities to the user interface. Finally, a modular and scalable software architecture is discussed along with plans for our pilot deployment with a security operation command.
CRMay 16, 2018
A Survey of Intrusion Detection Systems Leveraging Host DataTarrah R. Glass-Vanderlan, Michael D. Iannacone, Maria S. Vincent et al.
This survey focuses on intrusion detection systems (IDS) that leverage host-based data sources for detecting attacks on enterprise network. The host-based IDS (HIDS) literature is organized by the input data source, presenting targeted sub-surveys of HIDS research leveraging system logs, audit data, Windows Registry, file systems, and program analysis. While system calls are generally included in audit data, several publicly available system call datasets have spawned a flurry of IDS research on this topic, which merits a separate section. Similarly, a section surveying algorithmic developments that are applicable to HIDS but tested on network data sets is included, as this is a large and growing area of applicable literature. To accommodate current researchers, a supplementary section giving descriptions of publicly available datasets is included, outlining their characteristics and shortcomings when used for IDS evaluation. Related surveys are organized and described. All sections are accompanied by tables concisely organizing the literature and datasets discussed. Finally, challenges, trends, and broader observations are throughout the survey and in the conclusion along with future directions of IDS research.
LGFeb 7, 2018
Dimension Reduction Using Active ManifoldsRobert A. Bridges, Chris Felder, Chelsey Hoff
Scientists and engineers rely on accurate mathematical models to quantify the objects of their studies, which are often high-dimensional. Unfortunately, high-dimensional models are inherently difficult, i.e. when observations are sparse or expensive to determine. One way to address this problem is to approximate the original model with fewer input dimensions. Our project goal was to recover a function f that takes n inputs and returns one output, where n is potentially large. For any given n-tuple, we assume that we can observe a sample of the gradient and output of the function but it is computationally expensive to do so. This project was inspired by an approach known as Active Subspaces, which works by linearly projecting to a linear subspace where the function changes most on average. Our research gives mathematical developments informing a novel algorithm for this problem. Our approach, Active Manifolds, increases accuracy by seeking nonlinear analogues that approximate the function. The benefits of our approach are eliminated unprincipled parameter, choices, guaranteed accessible visualization, and improved estimation accuracy.
CROct 25, 2017
Setting the threshold for high throughput detectors: A mathematical approach for ensembles of dynamic, heterogeneous, probabilistic anomaly detectorsRobert A. Bridges, Jessie D. Jamieson, Joel W. Reed
Anomaly detection (AD) has garnered ample attention in security research, as such algorithms complement existing signature-based methods but promise detection of never-before-seen attacks. Cyber operations manage a high volume of heterogeneous log data; hence, AD in such operations involves multiple (e.g., per IP, per data type) ensembles of detectors modeling heterogeneous characteristics (e.g., rate, size, type) often with adaptive online models producing alerts in near real time. Because of high data volume, setting the threshold for each detector in such a system is an essential yet underdeveloped configuration issue that, if slightly mistuned, can leave the system useless, either producing a myriad of alerts and flooding downstream systems, or giving none. In this work, we build on the foundations of Ferragut et al. to provide a set of rigorous results for understanding the relationship between threshold values and alert quantities, and we propose an algorithm for setting the threshold in practice. Specifically, we give an algorithm for setting the threshold of multiple, heterogeneous, possibly dynamic detectors completely a priori, in principle. Indeed, if the underlying distribution of the incoming data is known (closely estimated), the algorithm provides provably manageable thresholds. If the distribution is unknown (e.g., has changed over time) our analysis reveals how the model distribution differs from the actual distribution, indicating a period of model refitting is necessary. We provide empirical experiments showing the efficacy of the capability by regulating the alert rate of a system with $\approx$2,500 adaptive detectors scoring over 1.5M events in 5 hours. Further, we demonstrate on the real network data and detection framework of Harshaw et al. the alternative case, showing how the inability to regulate alerts indicates the detection model is a bad fit to the data.
CRSep 25, 2017
Automated Behavioral Analysis of Malware A Case Study of WannaCry RansomwareQian Chen, Robert A. Bridges
Ransomware, a class of self-propagating malware that uses encryption to hold the victims' data ransom, has emerged in recent years as one of the most dangerous cyber threats, with widespread damage; e.g., zero-day ransomware WannaCry has caused world-wide catastrophe, from knocking U.K. National Health Service hospitals offline to shutting down a Honda Motor Company in Japan[1]. Our close collaboration with security operations of large enterprises reveals that defense against ransomware relies on tedious analysis from high-volume systems logs of the first few infections. Sandbox analysis of freshly captured malware is also commonplace in operation. We introduce a method to identify and rank the most discriminating ransomware features from a set of ambient (non-attack) system logs and at least one log stream containing both ambient and ransomware behavior. These ranked features reveal a set of malware actions that are produced automatically from system logs, and can help automate tedious manual analysis. We test our approach using WannaCry and two polymorphic samples by producing logs with Cuckoo Sandbox during both ambient, and ambient plus ransomware executions. Our goal is to extract the features of the malware from the logs with only knowledge that malware was present. We compare outputs with a detailed analysis of WannaCry allowing validation of the algorithm's feature extraction and provide analysis of the method's robustness to variations of input data\textemdash changing quality/quantity of ambient data and testing polymorphic ransomware. Most notably, our patterns are accurate and unwavering when generated from polymorphic WannaCry copies, on which 63 (of 63 tested) anti-virus (AV) products fail.
CRMay 4, 2017
Malware Detection on General-Purpose Computers Using Power Consumption Monitoring: A Proof of Concept and Case StudyJarilyn M. Hernández Jiménez, Jeffrey A. Nichols, Katerina Goseva-Popstojanova et al.
Malware detection is challenging when faced with automatically generated and polymorphic malware, as well as with rootkits, which are exceptionally hard to detect. In an attempt to contribute towards addressing these challenges, we conducted a proof of concept study that explored the use of power consumption for detection of malware presence in a general-purpose computer. The results of our experiments indicate that malware indeed leaves a signal on the power consumption of a general-purpose computer. Specifically, for the case study based on two different rootkits, the data collected at the +12V rails on the motherboard showed the most noticeable increment of the power consumption after the computer was infected. Our future work includes experimenting with more malware examples and workloads, and developing data analytics approach for automatic malware detection based on power consumption.
CRFeb 2, 2016
GraphPrints: Towards a Graph Analytic Method for Network Anomaly DetectionChristopher R. Harshaw, Robert A. Bridges, Michael D. Iannacone et al.
This paper introduces a novel graph-analytic approach for detecting anomalies in network flow data called GraphPrints. Building on foundational network-mining techniques, our method represents time slices of traffic as a graph, then counts graphlets -- small induced subgraphs that describe local topology. By performing outlier detection on the sequence of graphlet counts, anomalous intervals of traffic are identified, and furthermore, individual IPs experiencing abnormal behavior are singled-out. Initial testing of GraphPrints is performed on real network data with an implanted anomaly. Evaluation shows false positive rates bounded by 2.84% at the time-interval level, and 0.05% at the IP-level with 100% true positive rates at both.
IRApr 16, 2015
Towards a relation extraction framework for cyber-security conceptsCorinne L. Jones, Robert A. Bridges, Kelly Huffer et al.
In order to assist security analysts in obtaining information pertaining to their network, such as novel vulnerabilities, exploits, or patches, information retrieval methods tailored to the security domain are needed. As labeled text data is scarce and expensive, we follow developments in semi-supervised Natural Language Processing and implement a bootstrapping algorithm for extracting security entities and their relationships from text. The algorithm requires little input data, specifically, a few relations or patterns (heuristics for identifying relations), and incorporates an active learning component which queries the user on the most important decisions to prevent drifting from the desired relations. Preliminary testing on a small corpus shows promising results, obtaining precision of .82.
SIOct 16, 2014
Multi-Level Anomaly Detection on Time-Varying Graph DataRobert A. Bridges, John Collins, Erik M. Ferragut et al.
This work presents a novel modeling and analysis framework for graph sequences which addresses the challenge of detecting and contextualizing anomalies in labelled, streaming graph data. We introduce a generalization of the BTER model of Seshadhri et al. by adding flexibility to community structure, and use this model to perform multi-scale graph anomaly detection. Specifically, probability models describing coarse subgraphs are built by aggregating probabilities at finer levels, and these closely related hierarchical models simultaneously detect deviations from expectation. This technique provides insight into a graph's structure and internal context that may shed light on a detected event. Additionally, this multi-scale analysis facilitates intuitive visualizations by allowing users to narrow focus from an anomalous graph to particular subgraphs or nodes causing the anomaly. For evaluation, two hierarchical anomaly detectors are tested against a baseline Gaussian method on a series of sampled graphs. We demonstrate that our graph statistics-based approach outperforms both a distribution-based detector and the baseline in a labeled setting with community structure, and it accurately detects anomalies in synthetic and real-world datasets at the node, subgraph, and graph levels. To illustrate the accessibility of information made possible via this technique, the anomaly detector and an associated interactive visualization tool are tested on NCAA football data, where teams and conferences that moved within the league are identified with perfect recall, and precision greater than 0.786.
IRAug 22, 2013
Automatic Labeling for Entity Extraction in Cyber SecurityRobert A. Bridges, Corinne L. Jones, Michael D. Iannacone et al.
Timely analysis of cyber-security information necessitates automated information extraction from unstructured text. While state-of-the-art extraction methods produce extremely accurate results, they require ample training data, which is generally unavailable for specialized applications, such as detecting security related entities; moreover, manual annotation of corpora is very costly and often not a viable solution. In response, we develop a very precise method to automatically label text from several data sources by leveraging related, domain-specific, structured data and provide public access to a corpus annotated with cyber-security entities. Next, we implement a Maximum Entropy Model trained with the average perceptron on a portion of our corpus ($\sim$750,000 words) and achieve near perfect precision, recall, and accuracy, with training times under 17 seconds.
IRAug 21, 2013
PACE: Pattern Accurate Computationally Efficient Bootstrapping for Timely Discovery of Cyber-Security ConceptsNikki McNeil, Robert A. Bridges, Michael D. Iannacone et al.
Public disclosure of important security information, such as knowledge of vulnerabilities or exploits, often occurs in blogs, tweets, mailing lists, and other online sources months before proper classification into structured databases. In order to facilitate timely discovery of such knowledge, we propose a novel semi-supervised learning algorithm, PACE, for identifying and classifying relevant entities in text sources. The main contribution of this paper is an enhancement of the traditional bootstrapping method for entity extraction by employing a time-memory trade-off that simultaneously circumvents a costly corpus search while strengthening pattern nomination, which should increase accuracy. An implementation in the cyber-security domain is discussed as well as challenges to Natural Language Processing imposed by the security domain.