Charles Kamhoua

CR
h-index12
16papers
386citations
Novelty47%
AI Score40

16 Papers

LGMay 25, 2022
MAVIPER: Learning Decision Tree Policies for Interpretable Multi-Agent Reinforcement Learning

Stephanie Milani, Zhicheng Zhang, Nicholay Topin et al.

Many recent breakthroughs in multi-agent reinforcement learning (MARL) require the use of deep neural networks, which are challenging for human experts to interpret and understand. On the other hand, existing work on interpretable reinforcement learning (RL) has shown promise in extracting more interpretable decision tree-based policies from neural networks, but only in the single-agent setting. To fill this gap, we propose the first set of algorithms that extract interpretable decision-tree policies from neural networks trained with MARL. The first algorithm, IVIPER, extends VIPER, a recent method for single-agent interpretable RL, to the multi-agent setting. We demonstrate that IVIPER learns high-quality decision-tree policies for each agent. To better capture coordination between agents, we propose a novel centralized decision-tree training algorithm, MAVIPER. MAVIPER jointly grows the trees of each agent by predicting the behavior of the other agents using their anticipated trees, and uses resampling to focus on states that are critical for its interactions with other agents. We show that both algorithms generally outperform the baselines and that MAVIPER-trained agents achieve better-coordinated performance than IVIPER-trained agents on three different multi-agent particle-world environments.

CRMar 22, 2023
AIIPot: Adaptive Intelligent-Interaction Honeypot for IoT Devices

Volviane Saphir Mfogo, Alain Zemkoho, Laurent Njilla et al.

The proliferation of the Internet of Things (IoT) has raised concerns about the security of connected devices. There is a need to develop suitable and cost-efficient methods to identify vulnerabilities in IoT devices in order to address them before attackers seize opportunities to compromise them. The deception technique is a prominent approach to improving the security posture of IoT systems. Honeypot is a popular deception technique that mimics interaction in real fashion and encourages unauthorised users (attackers) to launch attacks. Due to the large number and the heterogeneity of IoT devices, manually crafting the low and high-interaction honeypots is not affordable. This has forced researchers to seek innovative ways to build honeypots for IoT devices. In this paper, we propose a honeypot for IoT devices that uses machine learning techniques to learn and interact with attackers automatically. The evaluation of the proposed model indicates that our system can improve the session length with attackers and capture more attacks on the IoT network.

36.2CRMar 21
Cyber Deception for Mission Surveillance via Hypergame-Theoretic Deep Reinforcement Learning

Zelin Wan, Jin-Hee Cho, Mu Zhu et al.

Unmanned Aerial Vehicles (UAVs) are valuable for mission-critical systems like surveillance, rescue, or delivery. Not surprisingly, such systems attract cyberattacks, including Denial-of-Service (DoS) attacks to overwhelm the resources of mission drones (MDs). How can we defend UAV mission systems against DoS attacks? We adopt cyber deception as a defense strategy, in which honey drones (HDs) are proposed to bait and divert attacks. The attack and deceptive defense hinge upon radio signal strength: The attacker selects victim MDs based on their signals, and HDs attract the attacker from afar by emitting stronger signals, despite this reducing battery life. We formulate an optimization problem for the attacker and defender to identify their respective strategies for maximizing mission performance while minimizing energy consumption. To address this problem, we propose a novel approach, called HT-DRL. HT-DRL identifies optimal solutions without a long learning convergence time by taking the solutions of hypergame theory into the neural network of deep reinforcement learning. This achieves a systematic way to intelligently deceive attackers. We analyze the performance of diverse defense mechanisms under different attack strategies. Further, the HT-DRL-based HD approach outperforms existing non-HD counterparts up to two times better in mission performance while incurring low energy consumption.

LGFeb 8, 2024
Decision Theory-Guided Deep Reinforcement Learning for Fast Learning

Zelin Wan, Jin-Hee Cho, Mu Zhu et al.

This paper introduces a novel approach, Decision Theory-guided Deep Reinforcement Learning (DT-guided DRL), to address the inherent cold start problem in DRL. By integrating decision theory principles, DT-guided DRL enhances agents' initial performance and robustness in complex environments, enabling more efficient and reliable convergence during learning. Our investigation encompasses two primary problem contexts: the cart pole and maze navigation challenges. Experimental results demonstrate that the integration of decision theory not only facilitates effective initial guidance for DRL agents but also promotes a more structured and informed exploration strategy, particularly in environments characterized by large and intricate state spaces. The results of experiment demonstrate that DT-guided DRL can provide significantly higher rewards compared to regular DRL. Specifically, during the initial phase of training, the DT-guided DRL yields up to an 184% increase in accumulated reward. Moreover, even after reaching convergence, it maintains a superior performance, ending with up to 53% more reward than standard DRL in large maze problems. DT-guided DRL represents an advancement in mitigating a fundamental challenge of DRL by leveraging functions informed by human (designer) knowledge, setting a foundation for further research in this promising interdisciplinary domain.

CRMay 1, 2023
IoTFlowGenerator: Crafting Synthetic IoT Device Traffic Flows for Cyber Deception

Joseph Bao, Murat Kantarcioglu, Yevgeniy Vorobeychik et al.

Over the years, honeypots emerged as an important security tool to understand attacker intent and deceive attackers to spend time and resources. Recently, honeypots are being deployed for Internet of things (IoT) devices to lure attackers, and learn their behavior. However, most of the existing IoT honeypots, even the high interaction ones, are easily detected by an attacker who can observe honeypot traffic due to lack of real network traffic originating from the honeypot. This implies that, to build better honeypots and enhance cyber deception capabilities, IoT honeypots need to generate realistic network traffic flows. To achieve this goal, we propose a novel deep learning based approach for generating traffic flows that mimic real network traffic due to user and IoT device interactions. A key technical challenge that our approach overcomes is scarcity of device-specific IoT traffic data to effectively train a generator. We address this challenge by leveraging a core generative adversarial learning algorithm for sequences along with domain specific knowledge common to IoT devices. Through an extensive experimental evaluation with 18 IoT devices, we demonstrate that the proposed synthetic IoT traffic generation tool significantly outperforms state of the art sequence and packet generators in remaining indistinguishable from real traffic even to an adaptive attacker.

GTSep 23, 2021
Learning Generative Deception Strategies in Combinatorial Masking Games

Junlin Wu, Charles Kamhoua, Murat Kantarcioglu et al.

Deception is a crucial tool in the cyberdefence repertoire, enabling defenders to leverage their informational advantage to reduce the likelihood of successful attacks. One way deception can be employed is through obscuring, or masking, some of the information about how systems are configured, increasing attacker's uncertainty about their targets. We present a novel game-theoretic model of the resulting defender-attacker interaction, where the defender chooses a subset of attributes to mask, while the attacker responds by choosing an exploit to execute. The strategies of both players have combinatorial structure with complex informational dependencies, and therefore even representing these strategies is not trivial. First, we show that the problem of computing an equilibrium of the resulting zero-sum defender-attacker game can be represented as a linear program with a combinatorial number of system configuration variables and constraints, and develop a constraint generation approach for solving this problem. Next, we present a novel highly scalable approach for approximately solving such games by representing the strategies of both players as neural networks. The key idea is to represent the defender's mixed strategy using a deep neural network generator, and then using alternating gradient-descent-ascent algorithm, analogous to the training of Generative Adversarial Networks. Our experiments, as well as a case study, demonstrate the efficacy of the proposed approach.

LGJun 30, 2021
Understanding Adversarial Examples Through Deep Neural Network's Response Surface and Uncertainty Regions

Juan Shu, Bowei Xi, Charles Kamhoua

Deep neural network (DNN) is a popular model implemented in many systems to handle complex tasks such as image classification, object recognition, natural language processing etc. Consequently DNN structural vulnerabilities become part of the security vulnerabilities in those systems. In this paper we study the root cause of DNN adversarial examples. We examine the DNN response surface to understand its classification boundary. Our study reveals the structural problem of DNN classification boundary that leads to the adversarial examples. Existing attack algorithms can generate from a handful to a few hundred adversarial examples given one clean image. We show there are infinitely many adversarial images given one clean sample, all within a small neighborhood of the clean sample. We then define DNN uncertainty regions and show transferability of adversarial examples is not universal. We also argue that generalization error, the large sample theoretical guarantee established for DNN, cannot adequately capture the phenomenon of adversarial examples. We need new theory to measure DNN robustness.

LGJan 22, 2021
Pareto GAN: Extending the Representational Power of GANs to Heavy-Tailed Distributions

Todd Huster, Jeremy E. J. Cohen, Zinan Lin et al.

Generative adversarial networks (GANs) are often billed as "universal distribution learners", but precisely what distributions they can represent and learn is still an open question. Heavy-tailed distributions are prevalent in many different domains such as financial risk-assessment, physics, and epidemiology. We observe that existing GAN architectures do a poor job of matching the asymptotic behavior of heavy-tailed distributions, a problem that we show stems from their construction. Additionally, when faced with the infinite moments and large distances between outlier points that are characteristic of heavy-tailed distributions, common loss functions produce unstable or near-zero gradients. We address these problems with the Pareto GAN. A Pareto GAN leverages extreme value theory and the functional properties of neural networks to learn a distribution that matches the asymptotic behavior of the marginal distributions of the features. We identify issues with standard loss functions and propose the use of alternative metric spaces that enable stable and efficient learning. Finally, we evaluate our proposed approach on a variety of heavy-tailed datasets.

CRJan 21, 2021
Game-Theoretic and Machine Learning-based Approaches for Defensive Deception: A Survey

Mu Zhu, Ahmed H. Anwar, Zelin Wan et al.

Defensive deception is a promising approach for cyber defense. Via defensive deception, the defender can anticipate attacker actions; it can mislead or lure attacker, or hide real resources. Although defensive deception is increasingly popular in the research community, there has not been a systematic investigation of its key components, the underlying principles, and its tradeoffs in various problem settings. This survey paper focuses on defensive deception research centered on game theory and machine learning, since these are prominent families of artificial intelligence approaches that are widely employed in defensive deception. This paper brings forth insights, lessons, and limitations from prior work. It closes with an outline of some research directions to tackle major gaps in current defensive deception research.

AIMay 13, 2019
Learning and Planning in the Feature Deception Problem

Zheyuan Ryan Shi, Ariel D. Procaccia, Kevin S. Chan et al.

Today's high-stakes adversarial interactions feature attackers who constantly breach the ever-improving security measures. Deception mitigates the defender's loss by misleading the attacker to make suboptimal decisions. In order to formally reason about deception, we introduce the feature deception problem (FDP), a domain-independent model and present a learning and planning framework for finding the optimal deception strategy, taking into account the adversary's preferences which are initially unknown to the defender. We make the following contributions. (1) We show that we can uniformly learn the adversary's preferences using data from a modest number of deception strategies. (2) We propose an approximation algorithm for finding the optimal deception strategy given the learned preferences and show that the problem is NP-hard. (3) We perform extensive experiments to validate our methods and results. In addition, we provide a case study of the credit bureau network to illustrate how FDP implements deception on a real-world problem.

CRApr 6, 2019
Exploring the Attack Surface of Blockchain: A Systematic Overview

Muhammad Saad, Jeffrey Spaulding, Laurent Njilla et al.

In this paper, we systematically explore the attack surface of the Blockchain technology, with an emphasis on public Blockchains. Towards this goal, we attribute attack viability in the attack surface to 1) the Blockchain cryptographic constructs, 2) the distributed architecture of the systems using Blockchain, and 3) the Blockchain application context. To each of those contributing factors, we outline several attacks, including selfish mining, the 51% attack, Domain Name System (DNS) attacks, distributed denial-of-service (DDoS) attacks, consensus delay (due to selfish behavior or distributed denial-of-service attacks), Blockchain forks, orphaned and stale blocks, block ingestion, wallet thefts, smart contract attacks, and privacy attacks. We also explore the causal relationships between these attacks to demonstrate how various attack vectors are connected to one another. A secondary contribution of this work is outlining effective defense measures taken by the Blockchain technology or proposed by researchers to mitigate the effects of these attacks and patch associated vulnerabilities

CRNov 25, 2018
Countering Selfish Mining in Blockchains

Muhammad Saad, Laurent Njilla, Charles Kamhoua et al.

Selfish mining is a well known vulnerability in blockchains exploited by miners to steal block rewards. In this paper, we explore a new form of selfish mining attack that guarantees high rewards with low cost. We show the feasibility of this attack facilitated by recent developments in blockchain technology opening new attack avenues. By outlining the limitations of existing countermeasures, we highlight a need for new defense strategies to counter this attack, and leverage key system parameters in blockchain applications to propose an algorithm that enforces fair mining. We use the expected transaction confirmation height and block publishing height to detect selfish mining behavior and develop a network-wide defense mechanism to disincentivize selfish miners. Our design involves a simple modifications to transactions' data structure in order to obtain a "truth state" used to catch the selfish miners and prevent honest miners from losing block rewards.

ITMar 22, 2017
Hardware Trojan Detection Game: A Prospect-Theoretic Approach

Walid Saad, Anibal Sanjab, Yunpeng Wang et al.

Outsourcing integrated circuit (IC) manufacturing to offshore foundries has grown exponentially in recent years. Given the critical role of ICs in the control and operation of vehicular systems and other modern engineering designs, such offshore outsourcing has led to serious security threats due to the potential of insertion of hardware trojans - malicious designs that, when activated, can lead to highly detrimental consequences. In this paper, a novel game-theoretic framework is proposed to analyze the interactions between a hardware manufacturer, acting as attacker, and an IC testing facility, acting as defender. The problem is formulated as a noncooperative game in which the attacker must decide on the type of trojan that it inserts while taking into account the detection penalty as well as the damage caused by the trojan. Meanwhile, the resource-constrained defender must decide on the best testing strategy that allows optimizing its overall utility which accounts for both damages and the fines. The proposed game is based on the robust behavioral framework of prospect theory (PT) which allows capturing the potential uncertainty, risk, and irrational behavior in the decision making of both the attacker and defender. For both, the standard rational expected utility (EUT) case and the PT case, a novel algorithm based on fictitious play is proposed and shown to converge to a mixed-strategy Nash equilibrium. For an illustrative case study, thorough analytical results are derived for both EUT and PT to study the properties of the reached equilibrium as well as the impact of key system parameters such as the defender-set fine. Simulation results assess the performance of the proposed framework under both EUT and PT and show that the use of PT will provide invaluable insights on the outcomes of the proposed hardware trojan game, in particular, and system security, in general.

CRFeb 21, 2017
Contract-Theoretic Resource Allocation for Critical Infrastructure Protection

AbdelRahman Eldosouky, Walid Saad, Charles Kamhoua et al.

Critical infrastructure protection (CIP) is envisioned to be one of the most challenging security problems in the coming decade. One key challenge in CIP is the ability to allocate resources, either personnel or cyber, to critical infrastructures with different vulnerability and criticality levels. In this work, a contract-theoretic approach is proposed to solve the problem of resource allocation in critical infrastructure with asymmetric information. A control center (CC) is used to design contracts and offer them to infrastructures' owners. A contract can be seen as an agreement between the CC and infrastructures using which the CC allocates resources and gets rewards in return. Contracts are designed in a way to maximize the CC's benefit and motivate each infrastructure to accept a contract and obtain proper resources for its protection. Infrastructures are defined by both vulnerability levels and criticality levels which are unknown to the CC. Therefore, each infrastructure can claim that it is the most vulnerable or critical to gain more resources. A novel mechanism is developed to handle such an asymmetric information while providing the optimal contract that motivates each infrastructure to reveal its actual type. The necessary and sufficient conditions for such resource allocation contracts under asymmetric information are derived. Simulation results show that the proposed contract-theoretic approach maximizes the CC's utility while ensuring that no infrastructure has an incentive to ask for another contract, despite the lack of exact information at the CC.

CRFeb 2, 2017
Beyond Free Riding: Quality of Indicators for Assessing Participation in Information Sharing for Threat Intelligence

Omar Al-Ibrahim, Aziz Mohaisen, Charles Kamhoua et al.

Threat intelligence sharing has become a growing concept, whereby entities can exchange patterns of threats with each other, in the form of indicators, to a community of trust for threat analysis and incident response. However, sharing threat-related information have posed various risks to an organization that pertains to its security, privacy, and competitiveness. Given the coinciding benefits and risks of threat information sharing, some entities have adopted an elusive behavior of "free-riding" so that they can acquire the benefits of sharing without contributing much to the community. So far, understanding the effectiveness of sharing has been viewed from the perspective of the amount of information exchanged as opposed to its quality. In this paper, we introduce the notion of quality of indicators (\qoi) for the assessment of the level of contribution by participants in information sharing for threat intelligence. We exemplify this notion through various metrics, including correctness, relevance, utility, and uniqueness of indicators. In order to realize the notion of \qoi, we conducted an empirical study and taken a benchmark approach to define quality metrics, then we obtained a reference dataset and utilized tools from the machine learning literature for quality assessment. We compared these results against a model that only considers the volume of information as a metric for contribution, and unveiled various interesting observations, including the ability to spot low quality contributions that are synonym to free riding in threat information sharing.

CRFeb 2, 2017
Rethinking Information Sharing for Actionable Threat Intelligence

Aziz Mohaisen, Omar Al-Ibrahim, Charles Kamhoua et al.

In the past decade, the information security and threat landscape has grown significantly making it difficult for a single defender to defend against all attacks at the same time. This called for introduc- ing information sharing, a paradigm in which threat indicators are shared in a community of trust to facilitate defenses. Standards for representation, exchange, and consumption of indicators are pro- posed in the literature, although various issues are undermined. In this paper, we rethink information sharing for actionable intelli- gence, by highlighting various issues that deserve further explo- ration. We argue that information sharing can benefit from well- defined use models, threat models, well-understood risk by mea- surement and robust scoring, well-understood and preserved pri- vacy and quality of indicators and robust mechanism to avoid free riding behavior of selfish agent. We call for using the differential nature of data and community structures for optimizing sharing.