Tansu Alpcan

h-index 48

26 papers

197 citations

Novelty 53%

AI Score 54

Ranked #27,310 of 201,326 authors (top 14%); #6,301 in LG (top 15%)

26 Papers

NI Jun 2, 2022

Artificial Intelligence Techniques for Next-Generation Mega Satellite Networks

Bassel Al Homssi, Kosta Dakic, Ke Wang et al.

Space communications, particularly massive satellite networks, have re-emerged as an appealing candidate for next generation networks due to major advances in space launching, electronics, processing power, and miniaturization. However, massive satellite networks rely on numerous underlying and intertwined processes that cannot be truly captured using conventional models, due to their dynamic and unique features such as orbital speed, inter-satellite links, short pass time, and satellite footprint, among others. Hence, new approaches are needed to enable the network to proactively adjust to the rapidly varying conditions associated with the link. Artificial intelligence (AI) provides a pathway to capture these processes, analyze their behavior, and model their effect on the network. This article introduces the application of AI techniques for integrated terrestrial satellite networks, particularly massive satellite network communications. It details the unique features of massive satellite networks, and the overarching challenges concomitant with their integration into the current communication infrastructure. Moreover, this article provides insights into state-of-the-art AI techniques across various layers of the communication link. This entails applying AI for forecasting the highly dynamic radio channel, spectrum sensing and classification, signal detection and demodulation, inter-satellite and satellite access network optimization, and network security. Moreover, future paradigms and the mapping of these mechanisms onto practical networks are outlined.

SY May 3

Adaptive Network Security Policies via Belief Aggregation and Rollout

Kim Hammar, Yuchao Li, Tansu Alpcan et al.

Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. The method uses a model or simulator of the system, which is updated when changes occur, and combines three components: belief estimation through particle filtering, offline policy computation through feature-based aggregation, and online policy adaptation through rollout. In particular, feature-based aggregation enables scalable offline optimization of a policy, while rollout adapts the policy online to changes in the system model without repeating the offline optimization. We analyze the approximation error of the aggregation and show that the rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.

OC May 11, 2011

A Framework for Optimization under Limited Information

Tansu Alpcan

In many real-world problems, optimization decisions have to be made with limited information. The decision maker may have no a priori or a posteriori data about the often nonconvex objective function, except at a limited number of points that are obtained over time through costly observations. This paper presents an optimization framework that takes into account the information collection (observation), estimation (regression), and optimization (maximization) aspects in a holistic and structured manner. Explicitly quantifying the information acquired at each optimization step using the entropy measure from information theory, the (nonconvex) objective function to be optimized (maximized) is modeled and estimated by adopting a Bayesian approach and using Gaussian processes as a state-of-the-art regression method. The resulting iterative scheme allows the decision maker to solve the problem by expressing preferences for each aspect quantitatively and concurrently.

OC May 11, 2011

Dual Control with Active Learning using Gaussian Process Regression

Tansu Alpcan

In many real-world problems, control decisions have to be made with limited information. The controller may have no a priori (or even a posteriori) data on the nonlinear system, except at a limited number of points that are obtained over time. This is due either to the high cost of observation or to the highly non-stationary nature of the system. The resulting conflict between information collection (identification, exploration) and control (optimization, exploitation) necessitates an active learning approach for iteratively selecting the control actions which concurrently provide the data points for system identification. This paper presents a dual control approach where the information acquired at each control step is quantified using the entropy measure from information theory and serves as the training input to a state-of-the-art Gaussian process regression (Bayesian learning) method. The explicit quantification of the information obtained from each data point allows for iterative optimization of both identification and control objectives. The approach developed is illustrated with two examples: control of the logistic map as a chaotic system and position control of a cart with an inverted pendulum.
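For readers unfamiliar with the first example system, the sketch below iterates the standard logistic map x_{t+1} = r * x_t * (1 - x_t) and compares two trajectories that start almost at the same point. It only illustrates why the map is a hard target for identification and control; it is not the paper's dual control method, and the function name, the value r = 4, and the initial conditions are assumptions chosen for illustration.

```python
def logistic_map(x0, r, steps):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two trajectories in the chaotic regime (r = 4) whose starting points
# differ by only 1e-9 (illustrative values, not taken from the paper).
traj_a = logistic_map(0.2, 4.0, 50)
traj_b = logistic_map(0.2 + 1e-9, 4.0, 50)

# Largest separation observed between the two trajectories.
max_gap = max(abs(a - b) for a, b in zip(traj_a, traj_b))
print("largest gap between the two trajectories:", max_gap)

# For r = 2 the point x = 0.5 is a fixed point, so the trajectory stays put.
fixed = logistic_map(0.5, 2.0, 5)
```

Because tiny differences in the state grow quickly when r = 4, a controller with few observations cannot rely on long-horizon predictions, which is the setting the abstract describes.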

QUANT-PH Sep 25, 2024

A Hybrid Quantum Neural Network for Split Learning

Hevish Cowlessur, Chandra Thapa, Tansu Alpcan et al.

Quantum Machine Learning (QML) is an emerging field of research with potential applications to distributed collaborative learning, such as Split Learning (SL). SL allows resource-constrained clients to collaboratively train ML models with a server, reduce their computational overhead, and enable data privacy by avoiding raw data sharing. Although QML with SL has been studied, the problem remains open in resource-constrained environments where clients lack quantum computing capabilities. Additionally, data privacy leakage between client and server in SL poses risks of reconstruction attacks on the server side. To address these issues, we propose Hybrid Quantum Split Learning (HQSL), an application of Hybrid QML in SL. HQSL enables classical clients to train models with a hybrid quantum server and curtails reconstruction attacks. Additionally, we introduce a novel qubit-efficient data-loading technique for designing a quantum layer in HQSL, minimizing both the number of qubits and circuit depth. Evaluations on real hardware demonstrate HQSL's practicality under realistic quantum noise. Experiments on five datasets demonstrate HQSL's feasibility and ability to enhance classification performance compared to its classical models. Notably, HQSL achieves mean improvements of over 3% in both accuracy and F1-score for the Fashion-MNIST dataset, and over 1.5% in both metrics for the Speech Commands dataset. We expand these studies to include up to 100 clients, confirming HQSL's scalability. Moreover, we introduce a noise-based defense mechanism to tackle reconstruction attacks on the server side. Overall, HQSL enables classical clients to train collaboratively with a hybrid quantum server, improving model performance and resistance against reconstruction attacks.

NI Sep 11, 2012

Competition and Regulation in Wireless Services Markets

Omer Korcak, George Iosifidis, Tansu Alpcan et al.

We consider a wireless services market where a set of operators compete for a large common pool of users. The latter have a reservation utility of U0 units or, equivalently, an alternative option to satisfy their communication needs. The operators must satisfy these minimum requirements in order to attract the users. We model the users' decisions and interaction as an evolutionary game and the competition among the operators as a noncooperative price game which is proved to be a potential game. For each set of prices selected by the operators, the evolutionary game attains a different stationary point. We show that the outcomes of both games depend on the reservation utility of the users and the amount of spectrum W the operators have at their disposal. We express the market welfare and the revenue of the operators as functions of these two parameters. Accordingly, we consider the scenario where a regulating agency is able to intervene and change the outcome of the market by tuning W and/or U0. Different regulators may have different objectives and criteria according to which they intervene. We analyze the various possible regulation methods and discuss their requirements, implications and impact on the market.

NI Dec 13, 2011

Incentive Mechanisms for Hierarchical Spectrum Markets

George Iosifidis, Anil Kumar Chorppath, Tansu Alpcan et al.

In this paper, we study spectrum allocation mechanisms in hierarchical multi-layer markets which are expected to proliferate in the near future based on the current spectrum policy reform proposals. We consider a setting where a state agency sells spectrum channels to Primary Operators (POs) who subsequently resell them to Secondary Operators (SOs) through auctions. We show that these hierarchical markets do not result in the socially efficient spectrum allocation that the agency aims for, due to the lack of coordination among the entities in different layers and the inherently selfish revenue-maximizing strategy of POs. In order to reconcile these opposing objectives, we propose an incentive mechanism which aligns the strategy and the actions of the POs with the objective of the agency, and thus leads to system performance improvement in terms of social welfare. This pricing-based scheme constitutes a method for hierarchical market regulation. A basic component of the proposed incentive mechanism is a novel auction scheme which enables POs to allocate their spectrum by balancing their derived revenue and the welfare of the SOs.

LG Apr 15

Parameter-efficient Quantum Multi-task Learning

Hevish Cowlessur, Chandra Thapa, Tansu Alpcan et al.

Multi-task learning (MTL) improves generalization and data efficiency by jointly learning related tasks through shared representations. In the widely used hard-parameter-sharing setting, a shared backbone is combined with task-specific prediction heads. However, task-specific parameters can grow rapidly with the number of tasks. Therefore, designing multi-task heads that preserve task specialization while improving parameter efficiency remains a key challenge. In Quantum Machine Learning (QML), variational quantum circuits (VQCs) provide a compact mechanism for mapping classical data to quantum states residing in high-dimensional Hilbert spaces, enabling expressive representations within constrained parameter budgets. We propose a parameter-efficient quantum multi-task learning (QMTL) framework that replaces conventional task-specific linear heads with a fully quantum prediction head in a hybrid architecture. The model consists of a VQC with a shared, task-independent quantum encoding stage, followed by lightweight task-specific ansatz blocks enabling localized task adaptation while maintaining compact parameterization. Under a controlled and capacity-matched formulation where the shared representation dimension grows with the number of tasks, our parameter-scaling analysis demonstrates that a standard classical head exhibits quadratic growth, whereas the proposed quantum head parameter cost scales linearly. We evaluate QMTL on three multi-task benchmarks spanning natural language processing, medical imaging, and multimodal sarcasm detection, where we achieve performance comparable to, and in some cases exceeding, classical hard-parameter-sharing baselines while consistently outperforming existing hybrid quantum MTL models with substantially fewer head parameters. We further demonstrate QMTL's executability on noisy simulators and real quantum hardware, illustrating its feasibility.

LG Mar 23, 2023

Failure-tolerant Distributed Learning for Anomaly Detection in Wireless Networks

Marc Katzef, Andrew C. Cullen, Tansu Alpcan et al.

The analysis of distributed techniques is often focused upon their efficiency, without considering their robustness (or lack thereof). Such a consideration is particularly important when devices or central servers can fail, which can potentially cripple distributed systems. When such failures arise in wireless communications networks, important services that they use/provide (like anomaly detection) can be left inoperable and can result in a cascade of security problems. In this paper, we present a novel method to address these risks by combining flat and star topologies, gaining the performance and reliability benefits of both. We refer to this method as "Tol-FL", due to its increased failure-tolerance as compared to the technique of Federated Learning. Our approach limits device failure risks while outperforming prior methods by up to 8% in terms of anomaly detection AUROC in a range of realistic settings that consider client as well as server failure, all while reducing communication costs. This performance demonstrates that Tol-FL is a highly suitable method for distributed model training for anomaly detection, especially in the domain of wireless networks.

AI Feb 5

Hallucination-Resistant Security Planning with a Large Language Model

Kim Hammar, Tansu Alpcan, Emil Lupu

Large language models (LLMs) are promising tools for supporting security management tasks, such as incident response planning. However, their unreliability and tendency to hallucinate remain significant challenges. In this paper, we address these challenges by introducing a principled framework for using an LLM as decision support in security management. Our framework integrates the LLM in an iterative loop where it generates candidate actions that are checked for consistency with system constraints and lookahead predictions. When consistency is low, we abstain from the generated actions and instead collect external feedback, e.g., by evaluating actions in a digital twin. This feedback is then used to refine the candidate actions through in-context learning (ICL). We prove that this design allows the hallucination risk to be controlled by tuning the consistency threshold. Moreover, we establish a bound on the regret of ICL under certain assumptions. To evaluate our framework, we apply it to an incident response use case where the goal is to generate a response and recovery plan based on system logs. Experiments on four public datasets show that our framework reduces recovery times by up to 30% compared to frontier LLMs.

LG May 8

Fortifying Time Series: DTW-Certified Robust Anomaly Detection

Shijie Liu, Tansu Alpcan, Christopher Leckie et al.

Time-series anomaly detection is critical for ensuring safety in high-stakes applications, where robustness is a fundamental requirement rather than a mere performance metric. Addressing the vulnerability of these systems to adversarial manipulation is therefore essential. Existing defenses are largely heuristic or provide certified robustness only under $\ell_p$-norm constraints, which are incompatible with time-series data. In particular, the $\ell_p$-norm fails to capture the intrinsic temporal structure in time series, causing small temporal distortions to significantly alter the $\ell_p$-norm measures. Instead, the similarity metric Dynamic Time Warping (DTW) is more suitable and widely adopted in the time-series domain, as DTW accounts for temporal alignment and remains robust to temporal variations. To date, however, there has been no certifiable robustness result in this metric that provides guarantees. In this work, we introduce the first DTW-certified robust defense in time-series anomaly detection by adapting the randomized smoothing paradigm. We develop this certificate by bridging the $\ell_p$-norm to DTW distance through a lower-bound transformation. Extensive experiments across various datasets and models validate the effectiveness and practicality of our theoretical approach. Results demonstrate significantly improved performance, e.g., up to 18.7% in F1-score under DTW-based adversarial attacks compared to traditional certified models.
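The contrast between an $\ell_p$ distance and DTW can be made concrete with a small sketch. The code below is the textbook dynamic-programming form of DTW with an absolute-difference local cost, not the certificate or defense proposed in the paper; the function names and the two toy series are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW with |x - y| local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = DTW distance between the first i points of a and first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again with a later point of b
                cost[i][j - 1],      # b[j-1] is matched again with a later point of a
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


def l1_distance(a, b):
    """Point-by-point l_1 distance between two equal-length series."""
    return sum(abs(x - y) for x, y in zip(a, b))


# The same bump, shifted by one time step (toy data).
series = [0, 0, 1, 2, 1, 0]
shifted = [0, 1, 2, 1, 0, 0]

print("l1 :", l1_distance(series, shifted))
print("dtw:", dtw_distance(series, shifted))
```

On this toy pair the point-by-point distance is large although the shape is unchanged, while DTW aligns the bump and reports no difference, which is the behaviour the abstract refers to.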

LG Feb 7, 2024

OIL-AD: An Anomaly Detection Framework for Sequential Decision Sequences

Chen Wang, Sarah Erfani, Tansu Alpcan et al.

Anomaly detection in decision-making sequences is a challenging problem due to the complexity of normality representation learning and the sequential nature of the task. Most existing methods based on Reinforcement Learning (RL) are difficult to implement in the real world due to unrealistic assumptions, such as having access to environment dynamics, reward signals, and online interactions with the environment. To address these limitations, we propose an unsupervised method named Offline Imitation Learning based Anomaly Detection (OIL-AD), which detects anomalies in decision-making sequences using two extracted behaviour features: action optimality and sequential association. Our offline learning model is an adaptation of behavioural cloning with a transformer policy network, where we modify the training process to learn a Q function and a state value function from normal trajectories. We propose that the Q function and the state value function can provide sufficient information about agents' behavioural data, from which we derive two features for anomaly detection. The intuition behind our method is that the action optimality feature derived from the Q function can differentiate the optimal action from others at each local state, and the sequential association feature derived from the state value function has the potential to maintain the temporal correlations between decisions (state-action pairs). Our experiments show that OIL-AD can achieve outstanding online anomaly detection performance with up to 34.8% improvement in F1 score over comparable baselines.

CR Aug 7, 2025

Incident Response Planning Using a Lightweight Large Language Model with Reduced Hallucination

Kim Hammar, Tansu Alpcan, Emil C. Lupu

Timely and effective incident response is key to managing the growing frequency of cyberattacks. However, identifying the right response actions for complex systems is a major technical challenge. A promising approach to mitigate this challenge is to use the security knowledge embedded in large language models (LLMs) to assist security operators during incident handling. Recent research has demonstrated the potential of this approach, but current methods are mainly based on prompt engineering of frontier LLMs, which is costly and prone to hallucinations. We address these limitations by presenting a novel way to use an LLM for incident response planning with reduced hallucination. Our method includes three steps: fine-tuning, information retrieval, and lookahead planning. We prove that our method generates response plans with a bounded probability of hallucination and that this probability can be made arbitrarily small at the expense of increased planning time under certain assumptions. Moreover, we show that our method is lightweight and can run on commodity hardware. We evaluate our method on logs from incidents reported in the literature. The experimental results show that our method a) achieves up to 22% shorter recovery times than frontier LLMs and b) generalizes to a broad range of incident types and response actions.

AI Sep 21, 2025

Intention-aware Hierarchical Diffusion Model for Long-term Trajectory Anomaly Detection

Chen Wang, Sarah Erfani, Tansu Alpcan et al.

Long-term trajectory anomaly detection is a challenging problem due to the diversity and complex spatiotemporal dependencies in trajectory data. Existing trajectory anomaly detection methods fail to simultaneously consider both the high-level intentions of agents as well as the low-level details of the agent's navigation when analysing an agent's trajectories. This limits their ability to capture the full diversity of normal trajectories. In this paper, we propose an unsupervised trajectory anomaly detection method named Intention-aware Hierarchical Diffusion model (IHiD), which detects anomalies through both high-level intent evaluation and low-level sub-trajectory analysis. Our approach leverages Inverse Q Learning as the high-level model to assess whether a selected subgoal aligns with an agent's intention based on predicted Q-values. Meanwhile, a diffusion model serves as the low-level model to generate sub-trajectories conditioned on subgoal information, with anomaly detection based on reconstruction error. By integrating both models, IHiD effectively utilises subgoal transition knowledge and is designed to capture the diverse distribution of normal trajectories. Our experiments show that the proposed method IHiD achieves up to 30.2% improvement in anomaly detection performance in terms of F1 score over state-of-the-art baselines.

QUANT-PH Jun 24, 2025

A Qubit-Efficient Hybrid Quantum Encoding Mechanism for Quantum Machine Learning

Hevish Cowlessur, Tansu Alpcan, Chandra Thapa et al.

Efficiently embedding high-dimensional datasets onto noisy and low-qubit quantum systems is a significant barrier to practical Quantum Machine Learning (QML). Approaches such as quantum autoencoders can be constrained by current hardware capabilities and may exhibit vulnerabilities to reconstruction attacks due to their invertibility. We propose Quantum Principal Geodesic Analysis (qPGA), a novel, non-invertible method for dimensionality reduction and qubit-efficient encoding. Executed classically, qPGA leverages Riemannian geometry to project data onto the unit Hilbert sphere, generating outputs inherently suitable for quantum amplitude encoding. This technique preserves the neighborhood structure of high-dimensional datasets within a compact latent space, significantly reducing qubit requirements for amplitude encoding. We derive theoretical bounds quantifying qubit requirements for effective encoding onto noisy systems. Empirical results on MNIST, Fashion-MNIST, and CIFAR-10 show that qPGA preserves local structure more effectively than both quantum and hybrid autoencoders. Additionally, we demonstrate that qPGA enhances resistance to reconstruction attacks due to its non-invertible nature. In downstream QML classification tasks, qPGA can achieve over 99% accuracy and F1-score on MNIST and Fashion-MNIST, outperforming quantum-dependent baselines. Initial tests on real hardware and noisy simulators confirm its potential for noise-resilient performance, offering a scalable solution for advancing QML applications.

LG Nov 11, 2024

Computable Model-Independent Bounds for Adversarial Quantum Machine Learning

Bacui Li, Tansu Alpcan, Chandra Thapa et al.

By leveraging the principles of quantum mechanics, QML opens doors to novel approaches in machine learning and offers potential speedup. However, machine learning models are well-documented to be vulnerable to malicious manipulations, and this susceptibility extends to the models of QML. This situation necessitates a thorough understanding of QML's resilience against adversarial attacks, particularly in an era where quantum computing capabilities are expanding. In this regard, this paper examines model-independent bounds on adversarial performance for QML. To the best of our knowledge, we introduce the first computation of an approximate lower bound for adversarial error when evaluating model resilience against sophisticated quantum-based adversarial attacks. Experimental results are compared to the computed bound, demonstrating the potential of QML models to achieve high robustness. In the best case, the experimental error is only 10% above the estimated bound, offering evidence of the inherent robustness of quantum models. This work not only advances our theoretical understanding of quantum model resilience but also provides a precise reference bound for the future development of robust QML algorithms.

CL Apr 28, 2024

Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression

Li Wan, Tansu Alpcan, Margreta Kuijper et al.

We propose a novel, lightweight supervised dictionary learning framework for text classification based on data compression and representation. This two-phase algorithm initially employs the Lempel-Ziv-Welch (LZW) algorithm to construct a dictionary from text datasets, focusing on the conceptual significance of dictionary elements. Subsequently, dictionaries are refined considering label data, optimizing dictionary atoms to enhance discriminative power based on mutual information and class distribution. This process generates discriminative numerical representations, facilitating the training of simple classifiers such as SVMs and neural networks. We evaluate our algorithm's information-theoretic performance using information bottleneck principles and introduce the information plane area rank (IPAR) as a novel metric to quantify the information-theoretic performance. Tested on six benchmark text datasets, our algorithm competes closely with top models, especially in limited-vocabulary contexts, using significantly fewer parameters. Our algorithm closely matches top-performing models, deviating by only about 2% on limited-vocabulary datasets, using just 10% of their parameters. However, it falls short on diverse-vocabulary datasets, likely due to the LZW algorithm's constraints with low-repetition data. This contrast highlights its efficiency and limitations across different dataset types.
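The first phase relies on LZW dictionary construction, which can be sketched in a few lines. This is the generic LZW encoder only; the paper's label-aware refinement of dictionary atoms is not shown. As a simplifying assumption, the dictionary is seeded with the characters that occur in the input rather than a full byte alphabet, and the function name is hypothetical.

```python
def lzw_dictionary(text):
    """Run LZW over `text`; return the learned phrase dictionary and the code sequence."""
    # Assumption: seed with the distinct characters of the input (sorted for determinism).
    dictionary = {ch: i for i, ch in enumerate(sorted(set(text)))}
    phrase = ""
    codes = []
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the current phrase while it is already known.
            phrase = candidate
        else:
            # Emit the longest known phrase, then learn the new, longer one.
            codes.append(dictionary[phrase])
            dictionary[candidate] = len(dictionary)
            phrase = ch
    if phrase:
        codes.append(dictionary[phrase])
    return dictionary, codes


dictionary, codes = lzw_dictionary("abababab")
print(sorted(dictionary, key=dictionary.get))
print(codes)
```

Repeated substrings become progressively longer dictionary entries, which is consistent with the abstract's remark that low-repetition, diverse-vocabulary text gives LZW less to work with.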

CR Dec 4, 2021

A Game-Theoretic Approach for AI-based Botnet Attack Defence

Hooman Alavizadeh, Julian Jang-Jaccard, Tansu Alpcan et al.

The new generation of botnets leverages Artificial Intelligence (AI) techniques to conceal the identity of botmasters and the attack intention to avoid detection. Unfortunately, there is no existing assessment tool capable of evaluating the effectiveness of existing defense strategies against this kind of AI-based botnet attack. In this paper, we propose a sequential game theory model that is capable of analysing the details of the potential strategies botnet attackers and defenders could use to reach Nash Equilibrium (NE). The utility function is computed under the assumption that the attacker launches the maximum number of DDoS attacks with the minimum attack cost while the defender utilises the maximum number of defense strategies with the minimum defense cost. We conduct a numerical analysis based on varying numbers of defense strategies deployed on different (simulated) cloud-band sizes in relation to different attack success rate values. Our experimental results confirm that the success of defense highly depends on the number of defense strategies used according to careful evaluation of attack rates.

GT Sep 29, 2021

A Communication Security Game on Switched Systems for Autonomous Vehicle Platoons

Guoxin Sun, Tansu Alpcan, Benjamin I. P. Rubinstein et al.

Vehicle-to-vehicle communication enables autonomous platoons to boost traffic efficiency and safety, while ensuring string stability with a constant spacing policy. However, communication-based controllers are susceptible to a range of cyber-attacks. In this paper, we propose a distributed attack mitigation defense framework with a dual-mode control system reconfiguration scheme to prevent a compromised platoon member from causing collisions via message falsification attacks. In particular, we model it as a switched system consisting of a communication-based cooperative controller and a sensor-based local controller and derive conditions to achieve global uniform exponential stability (GUES) as well as string stability in the sense of platoon operation. The switching decision comes from game-theoretic analysis of the attacker and the defender's interactions. In this framework, the attacker acts as a leader that chooses whether to engage in malicious activities and the defender decides which control system to deploy with the help of an anomaly detector. Imperfect detection reports associate the game with imperfect information. A dedicated state constraint further enhances safety against bounded but aggressive message modifications in which a bounded solution may still violate practical constraints, e.g., vehicles nearly crashing. Our formulation uniquely combines switched systems with security games to strategically improve the safety of such autonomous vehicle systems.

LG Sep 24, 2021

Local Intrinsic Dimensionality Signals Adversarial Perturbations

Sandamal Weerasinghe, Tansu Alpcan, Sarah M. Erfani et al.

The vulnerability of machine learning models to adversarial perturbations has motivated a significant amount of research under the broad umbrella of adversarial machine learning. Sophisticated attacks may cause learning algorithms to learn decision functions or make decisions with poor predictive performance. In this context, there is a growing body of literature that uses local intrinsic dimensionality (LID), a local metric that describes the minimum number of latent variables required to describe each data point, for detecting adversarial samples and subsequently mitigating their effects. The research to date has tended to focus on using LID as a practical defence method, often without fully explaining why LID can detect adversarial samples. In this paper, we derive a lower bound and an upper bound for the LID value of a perturbed data point and demonstrate that the bounds, in particular the lower bound, have a positive correlation with the magnitude of the perturbation. Hence, we demonstrate that data points that are perturbed by a large amount would have large LID values compared to unperturbed samples, thus justifying its use in the prior literature. Furthermore, our empirical validation demonstrates the validity of the bounds on benchmark datasets.
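To make the LID quantity concrete, the sketch below computes the maximum-likelihood estimate that is commonly used in the LID literature, LID ≈ -1 / mean(log(r_i / r_k)), from the distances r_1 ≤ ... ≤ r_k of a point to its k nearest neighbours. Whether this paper uses exactly this estimator is an assumption, and the synthetic distance profiles are illustrative: they mimic neighbours spread uniformly in a d-dimensional ball, where r_i is proportional to (i / k)^(1/d).

```python
import math


def lid_mle(distances):
    """Maximum-likelihood LID estimate from distances to the k nearest neighbours."""
    r = sorted(d for d in distances if d > 0)  # zero distances would break the log
    r_max = r[-1]
    mean_log_ratio = sum(math.log(d / r_max) for d in r) / len(r)
    if mean_log_ratio == 0.0:
        return float("inf")  # all neighbours at the same distance
    return -1.0 / mean_log_ratio


def uniform_ball_distances(dim, k):
    """Idealised neighbour distances for uniform data in `dim` dimensions (toy model)."""
    return [(i / k) ** (1.0 / dim) for i in range(1, k + 1)]


lid_2d = lid_mle(uniform_ball_distances(dim=2, k=1000))
lid_8d = lid_mle(uniform_ball_distances(dim=8, k=1000))
print("estimated LID, 2-D profile:", lid_2d)
print("estimated LID, 8-D profile:", lid_8d)
```

The estimate tracks the dimension of the neighbourhood, so a point pushed off the data manifold, whose neighbours look higher-dimensional, would be expected to receive a larger value, in line with the bounds described in the abstract.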

LG Aug 21, 2020

Defending Distributed Classifiers Against Data Poisoning Attacks

Sandamal Weerasinghe, Tansu Alpcan, Sarah M. Erfani et al.

Support Vector Machines (SVMs) are vulnerable to targeted training data manipulations such as poisoning attacks and label flips. By carefully manipulating a subset of training samples, the attacker forces the learner to compute an incorrect decision boundary, thereby causing misclassifications. Considering the increased importance of SVMs in engineering and life-critical applications, we develop a novel defense algorithm that improves resistance against such attacks. Local Intrinsic Dimensionality (LID) is a promising metric that characterizes the outlierness of data samples. In this work, we introduce a new approximation of LID called K-LID that uses kernel distance in the LID calculation, which allows LID to be calculated in high-dimensional transformed spaces. We introduce a weighted SVM against such attacks using K-LID as a distinguishing characteristic that de-emphasizes the effect of suspicious data samples on the SVM decision boundary. Each sample is weighted according to how likely its K-LID value is to come from the benign K-LID distribution rather than the attacked K-LID distribution. We then demonstrate how the proposed defense can be applied to a distributed SVM framework through a case study on an SDR-based surveillance system. Experiments with benchmark data sets show that the proposed defense reduces classification error rates substantially (10% on average).

LG Aug 21, 2020

Defending Regression Learners Against Poisoning Attacks

Sandamal Weerasinghe, Sarah M. Erfani, Tansu Alpcan et al.

Regression models, which are widely used from engineering applications to financial forecasting, are vulnerable to targeted malicious attacks such as training data poisoning, through which adversaries can manipulate their predictions. Previous works that attempt to address this problem rely on assumptions about the nature of the attack/attacker or overestimate the knowledge of the learner, making them impractical. We introduce a novel Local Intrinsic Dimensionality (LID) based measure called N-LID that measures the local deviation of a given data point's LID with respect to its neighbors. We then show that N-LID can distinguish poisoned samples from normal samples and propose an N-LID based defense approach that makes no assumptions about the attacker. Through extensive numerical experiments with benchmark datasets, we show that the proposed defense mechanism outperforms the state-of-the-art defenses in terms of prediction accuracy (up to 76% lower MSE compared to an undefended ridge model) and running time.

MLFeb 25, 2019

Adversarial Reinforcement Learning under Partial Observability in Autonomous Computer Network Defence

Yi Han, David Hubczenko, Paul Montague et al.

Recent studies have demonstrated that reinforcement learning (RL) agents are susceptible to adversarial manipulation, similar to vulnerabilities previously demonstrated in the supervised learning setting. While most existing work studies the problem in the context of computer vision or console games, this paper focuses on reinforcement learning in autonomous cyber defence under partial observability. We demonstrate that under the black-box setting, where the attacker has no direct access to the target RL model, causative attacks (attacks that target the training process) can poison RL agents even if the attacker only has partial observability of the environment. In addition, we propose an inversion defence method that aims to apply the opposite perturbation to that which an attacker might use to generate their adversarial samples. Our experimental results illustrate that the countermeasure can effectively reduce the impact of the causative attack, while not significantly affecting the training process in non-attack scenarios.

CRAug 17, 2018

Reinforcement Learning for Autonomous Defence in Software-Defined Networking

Yi Han, Benjamin I. P. Rubinstein, Tamas Abraham et al.

Despite the successful application of machine learning (ML) in a wide range of domains, adaptability, the very property that makes machine learning desirable, can be exploited by adversaries to contaminate training and evade classification. In this paper, we investigate the feasibility of applying a specific class of machine learning algorithms, namely, reinforcement learning (RL) algorithms, for autonomous cyber defence in software-defined networking (SDN). In particular, we focus on how an RL agent reacts to different forms of causative attacks that poison its training process, including indiscriminate and targeted, white-box and black-box attacks. In addition, we study the impact of the attack timing and explore potential countermeasures such as adversarial training.

AIJul 28, 2017

Toward the Starting Line: A Systems Engineering Approach to Strong AI

Tansu Alpcan, Sarah M. Erfani, Christopher Leckie

Artificial General Intelligence (AGI) or Strong AI aims to create machines with human-like or human-level intelligence, which is still a very ambitious goal when compared to the existing computing and AI systems. After many hype cycles and lessons from AI history, it is clear that a big conceptual leap is needed for crossing the starting line to kick-start mainstream AGI research. This position paper aims to make a small conceptual contribution toward reaching that starting line. After a broad analysis of the AGI problem from different perspectives, a system-theoretic and engineering-based research approach is introduced, which builds upon the existing mainstream AI and systems foundations. Several promising cross-fertilization opportunities between systems disciplines and AI research are identified. Specific potential research directions are discussed.

GTSep 21, 2016

Large-Scale Strategic Games and Adversarial Machine Learning

Tansu Alpcan, Benjamin I. P. Rubinstein, Christopher Leckie

Decision making in modern large-scale and complex systems such as communication networks, smart electricity grids, and cyber-physical systems motivates novel game-theoretic approaches. This paper investigates big strategic (non-cooperative) games where a finite number of individual players each have a large number of continuous decision variables and input data points. Such high-dimensional decision spaces and big data sets lead to computational challenges, relating to efforts in non-linear optimization scaling up to large systems of variables. In addition to these computational challenges, real-world players often have limited information about their preference parameters due to the prohibitive cost of identifying them or due to operating in dynamic online settings. The challenge of limited information is exacerbated in high dimensions and big data sets. Motivated by both computational and information limitations that constrain the direct solution of big strategic games, our investigation centers around reductions using linear transformations such as random projection methods and their effect on Nash equilibrium solutions. Specific analytical results are presented for quadratic games and approximations. In addition, an adversarial learning game is presented where random projection and sampling schemes are investigated.
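
As a rough illustration of the random projection idea mentioned in this abstract, the sketch below maps high-dimensional vectors to a lower dimension with a scaled Gaussian matrix and checks how well a pairwise distance is preserved. It shows only the generic dimensionality-reduction step, not the paper's game-theoretic reduction or its Nash equilibrium analysis; the dimensions `n` and `m`, the function names, and the synthetic vectors are all assumptions.

```python
import math
import random


def random_projection_matrix(m, n, rng):
    """m x n Gaussian matrix, scaled so squared norms are preserved in expectation."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def project(matrix, vec):
    """Apply the linear map `matrix` to `vec` (plain matrix-vector product)."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]


rng = random.Random(42)
n, m = 500, 200  # hypothetical original and reduced dimensions

# Two hypothetical high-dimensional decision vectors.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]

P = random_projection_matrix(m, n, rng)
original = math.dist(x, y)
reduced = math.dist(project(P, x), project(P, y))
ratio = reduced / original
print(f"distance ratio after projection: {ratio:.3f}")
```

A ratio close to 1 indicates that the reduced problem keeps roughly the same geometry as the original one, which is the property that makes such reductions attractive when the full decision space is too large to handle directly.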