74.6CRMay 28
A Bayesian Approach to Membership Inference for Statistical ReleaseLisa Oakley, Sam Stites, Cameron Moy et al.
The membership inference problem for publicly released statistics from a private dataset is well-studied. When developing and formally analyzing attack strategies, however, the focus has been on attacks that model the population using only its marginals. In practice, these attacks can perform well on various populations, however most formal analysis is for populations that follow a product distribution. These strategies may fail to leverage useful information about the population that is important for understanding a realistic privacy threat. In this work, we explore the impact of providing an attacker with additional information about the attribute dependency structure of the population, motivated by examples where multiple parties may have access to similarly structured data, for example the US Census and the IRS. To model this scenario, we re-frame the membership inference problem with respect to a population represented as a Bayesian network (BN). We develop a framework based on Bayesian decision-making which can incorporate prior information about the population to launch more effective, specialized attacks. To evaluate our framework, we introduce a specific attack instantiation which computes the Bayesian posterior using a probabilistic program, and prove its equivalence to an optimal variant of the likelihood ratio test attack for two populations with strong attribute dependency. We implement our program in the Roulette probabilistic programming language and show experimentally that it outperforms the likelihood ratio test and inner product attacks on five commonly used BNs, where the population dependency structure is too complex for the existing attacks to be manually adapted.
LGOct 22, 2024
Hierarchical Multi-agent Reinforcement Learning for Cyber Network DefenseAditya Vikram Singh, Ethan Rathbun, Emma Graham et al.
Recent advances in multi-agent reinforcement learning (MARL) have created opportunities to solve complex real-world tasks. Cybersecurity is a notable application area, where defending networks against sophisticated adversaries remains a challenging task typically performed by teams of security operators. In this work, we explore novel MARL strategies for building autonomous cyber network defenses that address challenges such as large policy spaces, partial observability, and stealthy, deceptive adversarial strategies. To facilitate efficient and generalized learning, we propose a hierarchical Proximal Policy Optimization (PPO) architecture that decomposes the cyber defense task into specific sub-tasks like network investigation and host recovery. Our approach involves training sub-policies for each sub-task using PPO enhanced with cybersecurity domain expertise. These sub-policies are then leveraged by a master defense policy that coordinates their selection to solve complex network defense tasks. Furthermore, the sub-policies can be fine-tuned and transferred with minimal cost to defend against shifts in adversarial behavior or changes in network settings. We conduct extensive experiments using CybORG Cage 4, the state-of-the-art MARL environment for cyber defense. Comparisons with multiple baselines across different adversaries show that our hierarchical learning approach achieves top performance in terms of convergence speed, episodic return, and several interpretable metrics relevant to cybersecurity, including the fraction of clean machines on the network, precision, and false positives.
CROct 5, 2021
Adversarial Robustness Verification and Attack Synthesis in Stochastic SystemsLisa Oakley, Alina Oprea, Stavros Tripakis
Probabilistic model checking is a useful technique for specifying and verifying properties of stochastic systems including randomized protocols and reinforcement learning models. Existing methods rely on the assumed structure and probabilities of certain system transitions. These assumptions may be incorrect, and may even be violated by an adversary who gains control of system components. In this paper, we develop a formal framework for adversarial robustness in systems modeled as discrete time Markov chains (DTMCs). We base our framework on existing methods for verifying probabilistic temporal logic properties and extend it to include deterministic, memoryless policies acting in Markov decision processes (MDPs). Our framework includes a flexible approach for specifying structure-preserving and non structure-preserving adversarial models. We outline a class of threat models under which adversaries can perturb system transitions, constrained by an $\varepsilon$ ball around the original transition probabilities. We define three main DTMC adversarial robustness problems: adversarial robustness verification, maximal $δ$ synthesis, and worst case attack synthesis. We present two optimization-based solutions to these three problems, leveraging traditional and parametric probabilistic model checking techniques. We then evaluate our solutions on two stochastic protocols and a collection of Grid World case studies, which model an agent acting in an environment described as an MDP. We find that the parametric solution results in fast computation for small parameter spaces. In the case of less restrictive (stronger) adversaries, the number of parameters increases, and directly computing property satisfaction probabilities is more scalable. We demonstrate the usefulness of our definitions and solutions by comparing system outcomes over various properties, threat models, and case studies.
CRJun 27, 2019
QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security GameLisa Oakley, Alina Oprea
A rise in Advanced Persistent Threats (APTs) has introduced a need for robustness against long-running, stealthy attacks which circumvent existing cryptographic security guarantees. FlipIt is a security game that models attacker-defender interactions in advanced scenarios such as APTs. Previous work analyzed extensively non-adaptive strategies in FlipIt, but adaptive strategies rise naturally in practical interactions as players receive feedback during the game. We model the FlipIt game as a Markov Decision Process and introduce QFlip, an adaptive strategy for FlipIt based on temporal difference reinforcement learning. We prove theoretical results on the convergence of our new strategy against an opponent playing with a Periodic strategy. We confirm our analysis experimentally by extensive evaluation of QFlip against specific opponents. QFlip converges to the optimal adaptive strategy for Periodic and Exponential opponents using associated state spaces. Finally, we introduce a generalized QFlip strategy with composite state space that outperforms a Greedy strategy for several distributions including Periodic and Uniform, without prior knowledge of the opponent's strategy. We also release an OpenAI Gym environment for FlipIt to facilitate future research.