Mahshid Helali Moghadam

SE
h-index13
12papers
181citations
Novelty42%
AI Score35

12 Papers

SEApr 16, 2022Code
Ergo, SMIRK is Safe: A Safety Case for a Machine Learning Component in a Pedestrian Automatic Emergency Brake System

Markus Borg, Jens Henriksson, Kasper Socha et al.

Integration of Machine Learning (ML) components in critical applications introduces novel challenges for software certification and verification. New safety standards and technical guidelines are under development to support the safety of ML-based systems, e.g., ISO 21448 SOTIF for the automotive domain and the Assurance of Machine Learning for use in Autonomous Systems (AMLAS) framework. SOTIF and AMLAS provide high-level guidance but the details must be chiseled out for each specific case. We initiated a research project with the goal to demonstrate a complete safety case for an ML component in an open automotive system. This paper reports results from an industry-academia collaboration on safety assurance of SMIRK, an ML-based pedestrian automatic emergency braking demonstrator running in an industry-grade simulator. We demonstrate an application of AMLAS on SMIRK for a minimalistic operational design domain, i.e., we share a complete safety case for its integrated ML-based component. Finally, we report lessons learned and provide both SMIRK and the safety case under an open-source licence for the research community to reuse.

SEMar 22, 2022
Machine Learning Testing in an ADAS Case Study Using Simulation-Integrated Bio-Inspired Search-Based Testing

Mahshid Helali Moghadam, Markus Borg, Mehrdad Saadatmand et al.

This paper presents an extended version of Deeper, a search-based simulation-integrated test solution that generates failure-revealing test scenarios for testing a deep neural network-based lane-keeping system. In the newly proposed version, we utilize a new set of bio-inspired search algorithms, genetic algorithm (GA), $(μ+λ)$ and $(μ,λ)$ evolution strategies (ES), and particle swarm optimization (PSO), that leverage a quality population seed and domain-specific cross-over and mutation operations tailored for the presentation model used for modeling the test scenarios. In order to demonstrate the capabilities of the new test generators within Deeper, we carry out an empirical evaluation and comparison with regard to the results of five participating tools in the cyber-physical systems testing competition at SBST 2021. Our evaluation shows the newly proposed test generators in Deeper not only represent a considerable improvement on the previous version but also prove to be effective and efficient in provoking a considerable number of diverse failure-revealing test scenarios for testing an ML-driven lane-keeping system. They can trigger several failures while promoting test scenario diversity, under a limited test time budget, high target failure severity, and strict speed limit constraints.

CRMay 11, 2023Code
Anomaly Detection Dataset for Industrial Control Systems

Alireza Dehlaghi-Ghadim, Mahshid Helali Moghadam, Ali Balador et al.

Over the past few decades, Industrial Control Systems (ICSs) have been targeted by cyberattacks and are becoming increasingly vulnerable as more ICSs are connected to the internet. Using Machine Learning (ML) for Intrusion Detection Systems (IDS) is a promising approach for ICS cyber protection, but the lack of suitable datasets for evaluating ML algorithms is a challenge. Although there are a few commonly used datasets, they may not reflect realistic ICS network data, lack necessary features for effective anomaly detection, or be outdated. This paper presents the 'ICS-Flow' dataset, which offers network data and process state variables logs for supervised and unsupervised ML-based IDS assessment. The network data includes normal and anomalous network packets and flows captured from simulated ICS components and emulated networks. The anomalies were injected into the system through various attack techniques commonly used by hackers to modify network traffic and compromise ICSs. We also proposed open-source tools, `ICSFlowGenerator' for generating network flow parameters from Raw network packets. The final dataset comprises over 25,000,000 raw network packets, network flow records, and process variable logs. The paper describes the methodology used to collect and label the dataset and provides a detailed data analysis. Finally, we implement several ML models, including the decision tree, random forest, and artificial neural network to detect anomalies and attacks, demonstrating that our dataset can be used effectively for training intrusion detection ML models.

CRJul 3, 2025
Alleviating Attack Data Scarcity: SCANIA's Experience Towards Enhancing In-Vehicle Cyber Security Measures

Frida Sundfeldt, Bianca Widstam, Mahshid Helali Moghadam et al.

The digital evolution of connected vehicles and the subsequent security risks emphasize the critical need for implementing in-vehicle cyber security measures such as intrusion detection and response systems. The continuous advancement of attack scenarios further highlights the need for adaptive detection mechanisms that can detect evolving, unknown, and complex threats. The effective use of ML-driven techniques can help address this challenge. However, constraints on implementing diverse attack scenarios on test vehicles due to safety, cost, and ethical considerations result in a scarcity of data representing attack scenarios. This limitation necessitates alternative efficient and effective methods for generating high-quality attack-representing data. This paper presents a context-aware attack data generator that generates attack inputs and corresponding in-vehicle network log, i.e., controller area network (CAN) log, representing various types of attack including denial of service (DoS), fuzzy, spoofing, suspension, and replay attacks. It utilizes parameterized attack models augmented with CAN message decoding and attack intensity adjustments to configure the attack scenarios with high similarity to real-world scenarios and promote variability. We evaluate the practicality of the generated attack-representing data within an intrusion detection system (IDS) case study, in which we develop and perform an empirical evaluation of two deep neural network IDS models using the generated data. In addition to the efficiency and scalability of the approach, the performance results of IDS models, high detection and classification capabilities, validate the consistency and effectiveness of the generated data as well. In this experience study, we also elaborate on the aspects influencing the fidelity of the data to real-world scenarios and provide insights into its application.

LGDec 30, 2023
A Maritime Industry Experience for Vessel Operational Anomaly Detection: Utilizing Deep Learning Augmented with Lightweight Interpretable Models

Mahshid Helali Moghadam, Mateusz Rzymowski, Lukasz Kulas

This study presents an industry experience showcasing a vessel operational anomaly detection approach that utilizes semi-supervised deep learning models augmented with lightweight interpretable surrogate models, applied to an industrial sensorized vessel, called TUCANA. We leverage standard and Long Short-Term Memory (LSTM) autoencoders trained on normal operational data and tested with real anomaly-revealing data. We then provide a projection of the inference results on a lower-dimension data map generated by t-distributed stochastic neighbor embedding (t-SNE), which serves as an unsupervised baseline and shows the distribution of the identified anomalies. We also develop lightweight surrogate models using random forest and decision tree to promote transparency and interpretability for the inference results of the deep learning models and assist the engineer with an agile assessment of the flagged anomalies. The approach is empirically evaluated using real data from TUCANA. The empirical results show higher performance of the LSTM autoencoder -- as the anomaly detection module with effective capturing of temporal dependencies in the data -- and demonstrate the practicality of the lightweight surrogate models in providing helpful interpretability, which leads to higher efficiency for the engineer's decision-making.

NENov 19, 2021
HMS-OS: Improving the Human Mental Search Optimisation Algorithm by Grouping in both Search and Objective Space

Seyed Jalaleddin Mousavirad, Gerald Schaefer, Iakov Korovin et al.

The human mental search (HMS) algorithm is a relatively recent population-based metaheuristic algorithm, which has shown competitive performance in solving complex optimisation problems. It is based on three main operators: mental search, grouping, and movement. In the original HMS algorithm, a clustering algorithm is used to group the current population in order to identify a promising region in search space, while candidate solutions then move towards the best candidate solution in the promising region. In this paper, we propose a novel HMS algorithm, HMS-OS, which is based on clustering in both objective and search space, where clustering in objective space finds a set of best candidate solutions whose centroid is then also used in updating the population. For further improvement, HMSOS benefits from an adaptive selection of the number of mental processes in the mental search operator. Experimental results on CEC-2017 benchmark functions with dimensionalities of 50 and 100, and in comparison to other optimisation algorithms, indicate that HMS-OS yields excellent performance, superior to those of other methods.

LGOct 17, 2021
An LSTM-based Plagiarism Detection via Attention Mechanism and a Population-based Approach for Pre-Training Parameters with imbalanced Classes

Seyed Vahid Moravvej, Seyed Jalaleddin Mousavirad, Mahshid Helali Moghadam et al.

Plagiarism is one of the leading problems in academic and industrial environments, which its goal is to find the similar items in a typical document or source code. This paper proposes an architecture based on a Long Short-Term Memory (LSTM) and attention mechanism called LSTM-AM-ABC boosted by a population-based approach for parameter initialization. Gradient-based optimization algorithms such as back-propagation (BP) are widely used in the literature for learning process in LSTM, attention mechanism, and feed-forward neural network, while they suffer from some problems such as getting stuck in local optima. To tackle this problem, population-based metaheuristic (PBMH) algorithms can be used. To this end, this paper employs a PBMH algorithm, artificial bee colony (ABC), to moderate the problem. Our proposed algorithm can find the initial values for model learning in all LSTM, attention mechanism, and feed-forward neural network, simultaneously. In other words, ABC algorithm finds a promising point for starting BP algorithm. For evaluation, we compare our proposed algorithm with both conventional and population-based methods. The results clearly show that the proposed method can provide competitive performance.

NESep 20, 2021
An Enhanced Differential Evolution Algorithm Using a Novel Clustering-based Mutation Operator

Seyed Jalaleddin Mousavirad, Gerald Schaefer, Iakov Korovin et al.

Differential evolution (DE) is an effective population-based metaheuristic algorithm for solving complex optimisation problems. However, the performance of DE is sensitive to the mutation operator. In this paper, we propose a novel DE algorithm, Clu-DE, that improves the efficacy of DE using a novel clustering-based mutation operator. First, we find, using a clustering algorithm, a winner cluster in search space and select the best candidate solution in this cluster as the base vector in the mutation operator. Then, an updating scheme is introduced to include new candidate solutions in the current population. Experimental results on CEC-2017 benchmark functions with dimensionalities of 30, 50 and 100 confirm that Clu-DE yields improved performance compared to DE.

AISep 16, 2021
Efficient and Effective Generation of Test Cases for Pedestrian Detection -- Search-based Software Testing of Baidu Apollo in SVL

Hamid Ebadi, Mahshid Helali Moghadam, Markus Borg et al.

With the growing capabilities of autonomous vehicles, there is a higher demand for sophisticated and pragmatic quality assurance approaches for machine learning-enabled systems in the automotive AI context. The use of simulation-based prototyping platforms provides the possibility for early-stage testing, enabling inexpensive testing and the ability to capture critical corner-case test scenarios. Simulation-based testing properly complements conventional on-road testing. However, due to the large space of test input parameters in these systems, the efficient generation of effective test scenarios leading to the unveiling of failures is a challenge. This paper presents a study on testing pedestrian detection and emergency braking system of the Baidu Apollo autonomous driving platform within the SVL simulator. We propose an evolutionary automated test generation technique that generates failure-revealing scenarios for Apollo in the SVL environment. Our approach models the input space using a generic and flexible data structure and benefits a multi-criteria safety-based heuristic for the objective function targeted for optimization. This paper presents the results of our proposed test generation technique in the 2021 IEEE Autonomous Driving AI Test Challenge. In order to demonstrate the efficiency and effectiveness of our approach, we also report the results from a baseline random generation technique. Our evaluation shows that the proposed evolutionary test case generator is more effective at generating failure-revealing test cases and provides higher diversity between the generated failures than the random baseline.

SEApr 26, 2021
Performance Testing Using a Smart Reinforcement Learning-Driven Test Agent

Mahshid Helali Moghadam, Golrokh Hamidi, Markus Borg et al.

Performance testing with the aim of generating an efficient and effective workload to identify performance issues is challenging. Many of the automated approaches mainly rely on analyzing system models, source code, or extracting the usage pattern of the system during the execution. However, such information and artifacts are not always available. Moreover, all the transactions within a generated workload do not impact the performance of the system the same way, a finely tuned workload could accomplish the test objective in an efficient way. Model-free reinforcement learning is widely used for finding the optimal behavior to accomplish an objective in many decision-making problems without relying on a model of the system. This paper proposes that if the optimal policy (way) for generating test workload to meet a test objective can be learned by a test agent, then efficient test automation would be possible without relying on system models or source code. We present a self-adaptive reinforcement learning-driven load testing agent, RELOAD, that learns the optimal policy for test workload generation and generates an effective workload efficiently to meet the test objective. Once the agent learns the optimal policy, it can reuse the learned policy in subsequent testing activities. Our experiments show that the proposed intelligent load test agent can accomplish the test objective with lower test cost compared to common load testing procedures, and results in higher test efficiency.

SEApr 5, 2021
Automated Performance Testing Based on Active Deep Learning

Ali Sedaghatbaf, Mahshid Helali Moghadam, Mehrdad Saadatmand

Generating tests that can reveal performance issues in large and complex software systems within a reasonable amount of time is a challenging task. On one hand, there are numerous combinations of input data values to explore. On the other hand, we have a limited test budget to execute tests. What makes this task even more difficult is the lack of access to source code and the internal details of these systems. In this paper, we present an automated test generation method called ACTA for black-box performance testing. ACTA is based on active learning, which means that it does not require a large set of historical test data to learn about the performance characteristics of the system under test. Instead, it dynamically chooses the tests to execute using uncertainty sampling. ACTA relies on a conditional variant of generative adversarial networks,and facilitates specifying performance requirements in terms of conditions and generating tests that address those conditions.We have evaluated ACTA on a benchmark web application, and the experimental results indicate that this method is comparable with random testing, and two other machine learning methods,i.e. PerfXRL and DN.

SEAug 19, 2019
An Autonomous Performance Testing Framework using Self-Adaptive Fuzzy Reinforcement Learning

Mahshid Helali Moghadam, Mehrdad Saadatmand, Markus Borg et al.

Test automation brings the potential to reduce costs and human effort, but several aspects of software testing remain challenging to automate. One such example is automated performance testing to find performance breaking points. Current approaches to tackle automated generation of performance test cases mainly involve using source code or system model analysis or use-case based techniques. However, source code and system models might not always be available at testing time. On the other hand, if the optimal performance testing policy for the intended objective in a testing process instead could be learned by the testing system, then test automation without advanced performance models could be possible. Furthermore, the learned policy could later be reused for similar software systems under test, thus leading to higher test efficiency. We propose SaFReL, a self-adaptive fuzzy reinforcement learning-based performance testing framework. SaFReL learns the optimal policy to generate performance test cases through an initial learning phase, then reuses it during a transfer learning phase, while keeping the learning running and updating the policy in the long term. Through multiple experiments on a simulated environment, we demonstrate that our approach generates the target performance test cases for different programs more efficiently than a typical testing process, and performs adaptively without access to source code and performance models.