Shahrear Iqbal

CR
h-index42
9papers
129citations
Novelty42%
AI Score49

9 Papers

CRMay 28
SAMD: A Tool for Identifying False Data Injection Scenarios in AI/ML-enabled Medical Devices

Mohammadreza Hallajiyan, Xueren Ge, Athish Pranav Dharmalingam et al.

The growing integration of artificial intelligence (AI) and machine learning (ML) in medical systems requires effective measures to address emerging security risks. One such risk is that of adversaries introducing false data through vulnerable system components during inference, causing misdiagnosis and wrong treatments. These risks are challenging to anticipate and address in the design phase, as the system assembly partially occurs during actual use by end users. To address this concern, we introduce SAMD, an automated tool for performing System Theoretic Process Analysis for Security (STPA-Sec) on AI/ML-enabled medical devices during the design phase. SAMD models the medical system as a control structure, treating all system components as potential points for injecting false data into the ML engine. It leverages state-of-the-art vulnerability databases and Large Language Models (LLMs) to automate vulnerability discovery and generate a list of potential attack scenarios. We demonstrate SAMD's effectiveness through case studies on five FDA-cleared medical devices, showcasing its ability to identify vulnerable points and potential attack paths. We find that SAMD has 100% precision in identifying target device technologies in the case studies' documents, retrieves the known vulnerabilities linked to them (with 63.2% precision), and generates highly relevant attack scenarios on the ML model, including detailed steps that an adversary might take (with 95.3% accuracy, and the highest time taken being 191.64s).

AIJan 22
Improving Methodologies for Agentic Evaluations Across Domains: Leakage of Sensitive Information, Fraud and Cybersecurity Threats

Ee Wei Seah, Yongsen Zheng, Naga Nikshith et al.

The rapid rise of autonomous AI systems and advancements in agent capabilities are introducing new risks due to reduced oversight of real-world interactions. Yet agent testing remains nascent and is still a developing science. As AI agents begin to be deployed globally, it is important that they handle different languages and cultures accurately and securely. To address this, participants from The International Network for Advanced AI Measurement, Evaluation and Science, including representatives from Singapore, Japan, Australia, Canada, the European Commission, France, Kenya, South Korea, and the United Kingdom have come together to align approaches to agentic evaluations. This is the third exercise, building on insights from two earlier joint testing exercises conducted by the Network in November 2024 and February 2025. The objective is to further refine best practices for testing advanced AI systems. The exercise was split into two strands: (1) common risks, including leakage of sensitive information and fraud, led by Singapore AISI; and (2) cybersecurity, led by UK AISI. A mix of open and closed-weight models were evaluated against tasks from various public agentic benchmarks. Given the nascency of agentic testing, our primary focus was on understanding methodological issues in conducting such tests, rather than examining test results or model capabilities. This collaboration marks an important step forward as participants work together to advance the science of agentic evaluations.

CRNov 9, 2023
LogShield: A Transformer-based APT Detection System Leveraging Self-Attention

Sihat Afnan, Mushtari Sadia, Shahrear Iqbal et al.

Cyber attacks are often identified using system and network logs. There have been significant prior works that utilize provenance graphs and ML techniques to detect attacks, specifically advanced persistent threats, which are very difficult to detect. Lately, there have been studies where transformer-based language models are being used to detect various types of attacks from system logs. However, no such attempts have been made in the case of APTs. In addition, existing state-of-the-art techniques that use system provenance graphs, lack a data processing framework generalized across datasets for optimal performance. For mitigating this limitation as well as exploring the effectiveness of transformer-based language models, this paper proposes LogShield, a framework designed to detect APT attack patterns leveraging the power of self-attention in transformers. We incorporate customized embedding layers to effectively capture the context of event sequences derived from provenance graphs. While acknowledging the computational overhead associated with training transformer networks, our framework surpasses existing LSTM and Language models regarding APT detection. We integrated the model parameters and training procedure from the RoBERTa model and conducted extensive experiments on well-known APT datasets (DARPA OpTC and DARPA TC E3). Our framework achieved superior F1 scores of 98% and 95% on the two datasets respectively, surpassing the F1 scores of 96% and 94% obtained by LSTM models. Our findings suggest that LogShield's performance benefits from larger datasets and demonstrates its potential for generalization across diverse domains. These findings contribute to the advancement of APT attack detection methods and underscore the significance of transformer-based architectures in addressing security challenges in computer systems.

CRMar 27
ROAST: Risk-aware Outlier-exposure for Adversarial Selective Training of Anomaly Detectors Against Evasion Attacks

Mohammed Elnawawy, Gargi Mitra, Shahrear Iqbal et al.

Safety-critical domains like healthcare rely on deep neural networks (DNNs) for prediction, yet DNNs remain vulnerable to evasion attacks. Anomaly detectors (ADs) are widely used to protect DNNs, but conventional ADs are trained indiscriminately on benign data from all patients, overlooking physiological differences that introduce noise, degrade robustness, and reduce recall. In this paper, we propose ROAST, a novel risk-aware outlier exposure selective training framework that improves AD recall without sacrificing precision. ROAST identifies patients who are less vulnerable to attack and focuses training on these cleaner, more reliable data, thereby reducing false negatives and improving recall. To preserve precision, the framework applies outlier exposure by injecting adversarial samples into the training set of the less vulnerable patients, avoiding noisy data from others. Experiments show that ROAST increases recall by 16.2\% while reducing the training time by 88.3\% on average compared to indiscriminate training, with minimal impact on precision.

CRJan 30, 2024
Systematically Assessing the Security Risks of AI/ML-enabled Connected Healthcare Systems

Mohammed Elnawawy, Mohammadreza Hallajiyan, Gargi Mitra et al.

The adoption of machine-learning-enabled systems in the healthcare domain is on the rise. While the use of ML in healthcare has several benefits, it also expands the threat surface of medical systems. We show that the use of ML in medical systems, particularly connected systems that involve interfacing the ML engine with multiple peripheral devices, has security risks that might cause life-threatening damage to a patient's health in case of adversarial interventions. These new risks arise due to security vulnerabilities in the peripheral devices and communication channels. We present a case study where we demonstrate an attack on an ML-enabled blood glucose monitoring system by introducing adversarial data points during inference. We show that an adversary can achieve this by exploiting a known vulnerability in the Bluetooth communication channel connecting the glucose meter with the ML-enabled app. We further show that state-of-the-art risk assessment techniques are not adequate for identifying and assessing these new risks. Our study highlights the need for novel risk analysis methods for analyzing the security of AI-enabled connected health devices.

CRJun 18, 2025
Systems-Theoretic and Data-Driven Security Analysis in ML-enabled Medical Devices

Gargi Mitra, Mohammadreza Hallajiyan, Inji Kim et al.

The integration of AI/ML into medical devices is rapidly transforming healthcare by enhancing diagnostic and treatment facilities. However, this advancement also introduces serious cybersecurity risks due to the use of complex and often opaque models, extensive interconnectivity, interoperability with third-party peripheral devices, Internet connectivity, and vulnerabilities in the underlying technologies. These factors contribute to a broad attack surface and make threat prevention, detection, and mitigation challenging. Given the highly safety-critical nature of these devices, a cyberattack on these devices can cause the ML models to mispredict, thereby posing significant safety risks to patients. Therefore, ensuring the security of these devices from the time of design is essential. This paper underscores the urgency of addressing the cybersecurity challenges in ML-enabled medical devices at the pre-market phase. We begin by analyzing publicly available data on device recalls and adverse events, and known vulnerabilities, to understand the threat landscape of AI/ML-enabled medical devices and their repercussions on patient safety. Building on this analysis, we introduce a suite of tools and techniques designed by us to assist security analysts in conducting comprehensive premarket risk assessments. Our work aims to empower manufacturers to embed cybersecurity as a core design principle in AI/ML-enabled medical devices, thereby making them safe for patients.

CRDec 21, 2021
ANUBIS: A Provenance Graph-Based Framework for Advanced Persistent Threat Detection

Md. Monowar Anjum, Shahrear Iqbal, Benoit Hamelin

We present ANUBIS, a highly effective machine learning-based APT detection system. Our design philosophy for ANUBIS involves two principal components. Firstly, we intend ANUBIS to be effectively utilized by cyber-response teams. Therefore, prediction explainability is one of the main focuses of ANUBIS design. Secondly, ANUBIS uses system provenance graphs to capture causality and thereby achieve high detection performance. At the core of the predictive capability of ANUBIS, there is a Bayesian Neural Network that can tell how confident it is in its predictions. We evaluate ANUBIS against a recent APT dataset (DARPA OpTC) and show that ANUBIS can detect malicious activity akin to APT campaigns with high accuracy. Moreover, ANUBIS learns about high-level patterns that allow it to explain its predictions to threat analysts. The high predictive performance with explainable attack story reconstruction makes ANUBIS an effective tool to use for enterprise cyber defense.

CRMar 4, 2021
Analyzing the Usefulness of the DARPA OpTC Dataset in Cyber Threat Detection Research

Md. Monowar Anjum, Shahrear Iqbal, Benoit Hamelin

Maintaining security and privacy in real-world enterprise networks is becoming more and more challenging. Cyber actors are increasingly employing previously unreported and state-of-the-art techniques to break into corporate networks. To develop novel and effective methods to thwart these sophisticated cyberattacks, we need datasets that reflect real-world enterprise scenarios to a high degree of accuracy. However, precious few such datasets are publicly available. Researchers still predominantly use the decade-old KDD datasets, however, studies showed that these datasets do not adequately reflect modern attacks like Advanced Persistent Threats(APT). In this work, we analyze the usefulness of the recently introduced DARPA Operationally Transparent Cyber (OpTC) dataset in this regard. We describe the content of the dataset in detail and present a qualitative analysis. We show that the OpTC dataset is an excellent candidate for advanced cyber threat detection research while also highlighting its limitations. Additionally, we propose several research directions where this dataset can be useful.

LGJan 8, 2021
Towards a Robust and Trustworthy Machine Learning System Development: An Engineering Perspective

Pulei Xiong, Scott Buffett, Shahrear Iqbal et al.

While Machine Learning (ML) technologies are widely adopted in many mission critical fields to support intelligent decision-making, concerns remain about system resilience against ML-specific security attacks and privacy breaches as well as the trust that users have in these systems. In this article, we present our recent systematic and comprehensive survey on the state-of-the-art ML robustness and trustworthiness from a security engineering perspective, focusing on the problems in system threat analysis, design and evaluation faced in developing practical machine learning applications, in terms of robustness and user trust. Accordingly, we organize the presentation of this survey intended to facilitate the convey of the body of knowledge from this angle. We then describe a metamodel we created that represents the body of knowledge in a standard and visualized way. We further illustrate how to leverage the metamodel to guide a systematic threat analysis and security design process which extends and scales up the classic process. Finally, we propose the future research directions motivated by our findings. Our work differs itself from the existing surveys by (i) exploring the fundamental principles and best practices to support robust and trustworthy ML system development, and (ii) studying the interplay of robustness and user trust in the context of ML systems. We expect this survey provides a big picture for machine learning security practitioners.