Mehrdad Sabetzadeh

SE
h-index: 37
Papers: 11
Citations: 217
Novelty: 44%
AI Score: 47

11 Papers

SE | Mar 16, 2023 | Code
Measuring Improvement of F$_1$-Scores in Detection of Self-Admitted Technical Debt

William Aiken, Paul K. Mvula, Paula Branco et al.

Artificial Intelligence and Machine Learning have witnessed rapid, significant improvements in Natural Language Processing (NLP) tasks. Utilizing Deep Learning, researchers have taken advantage of repository comments in Software Engineering to produce accurate methods for detecting Self-Admitted Technical Debt (SATD) from 20 open-source Java projects' code. In this work, we improve SATD detection with a novel approach that leverages the Bidirectional Encoder Representations from Transformers (BERT) architecture. For comparison, we re-evaluated previous deep learning methods and applied stratified 10-fold cross-validation to report reliable F$_1$-scores. We examine our model in both cross-project and intra-project contexts. For each context, we use re-sampling and duplication as augmentation strategies to account for data imbalance. We find that our trained BERT model improves over the best performance of all previous methods in 19 of the 20 projects in cross-project scenarios. However, the data augmentation techniques were not sufficient to overcome the lack of data present in the intra-project scenarios, and existing methods still perform better. Future research will look into ways to diversify SATD datasets in order to maximize the latent power in large BERT models.
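The evaluation protocol named in this abstract, stratified 10-fold cross-validation scored with F$_1$, can be illustrated with a minimal pure-Python sketch. The function names `f1_score` and `stratified_folds` are hypothetical, the round-robin fold assignment (without shuffling) is an assumption made for brevity, and none of this is taken from the authors' code.

```python
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def stratified_folds(labels, k=10):
    """Split sample indices into k folds that keep the class ratio roughly equal.

    Assumption: indices of each class are dealt round-robin into the folds;
    a real experiment would normally shuffle within each class first.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Imbalanced toy data: 20 SATD comments (label 1) among 100 comments.
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
```

With this toy data every fold holds 10 comments, 2 of them positive, so each test fold preserves the 20% minority-class ratio that makes F$_1$ a more informative score than plain accuracy.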

43.2 | SE | May 2
Genetic Programming for Self-Adaptive Auto-Scaling of Microservices

Jia Li, Mehrdad Sabetzadeh, Shiva Nejati

Microservice architecture is widely adopted in modern systems, where auto-scaling is critical for satisfying service-level objectives (SLOs). However, determining optimal scaling for microservices is difficult, and reactive resource allocation often leads to costly over- or under-provisioning. We propose AutoSLO, a learning-based, self-adaptive scaling framework that dynamically adjusts microservice replicas to meet SLOs while minimizing resource usage. AutoSLO uses a continuous monitoring-adaptation feedback loop and leverages genetic programming to learn and evolve scaling logic, enabling the deployed microservice system to proactively prevent SLO violations rather than repeatedly searching for one-off scaling actions. We evaluate AutoSLO on two case-study systems -- an online shopping platform and a chatbot based on large language models -- and show that this framework substantially reduces resource usage while maintaining a low frequency of SLO violations, all of which are resolved within a short time window.

SE | Jun 21, 2022
TAPHSIR: Towards AnaPHoric Ambiguity Detection and ReSolution In Requirements

Saad Ezzini, Sallam Abualhaija, Chetan Arora et al.

We introduce TAPHSIR, a tool for anaphoric ambiguity detection and anaphora resolution in requirements. TAPHSIR facilitates reviewing the use of pronouns in a requirements specification and revising those pronouns that can lead to misunderstandings during the development process. To this end, TAPHSIR detects the requirements which have potential anaphoric ambiguity and further attempts to interpret anaphora occurrences automatically. TAPHSIR employs a hybrid solution composed of an ambiguity detection solution based on machine learning and an anaphora resolution solution based on a variant of the BERT language model. Given a requirements specification, TAPHSIR decides for each pronoun occurrence in the specification whether the pronoun is ambiguous or unambiguous, and further provides an automatic interpretation for the pronoun. The output generated by TAPHSIR can be easily reviewed and validated by requirements engineers. TAPHSIR is publicly available on Zenodo (DOI: 10.5281/zenodo.5902117).

29.0 | SE | Apr 16
Automated Test Validators for Flaky Cyber-Physical System Simulators: Approach and Evaluation

Baharin A. Jodat, Khouloud Gaaloul, Mehrdad Sabetzadeh et al.

Simulation-based testing of cyber-physical systems (CPS) is costly due to the time-consuming execution of CPS simulators. In addition, CPS simulators may be flaky, leading to inconsistent test outcomes and requiring repeated test re-execution for reliable test verdicts. Many test inputs within the input space of CPS may not effectively exercise the behaviour of the system under test (SUT) -- for instance, those that violate system preconditions, exceed operational design domain (ODD) limits, or represent inherently safe scenarios. In this article, we propose to use test validators to filter out such test inputs before execution. We describe two methods for generating test validators: one using genetic programming (GP) that employs well-known spectrum-based fault localization (SBFL) ranking formulas, namely Ochiai, Tarantula, and Naish, as fitness functions; and the other using decision trees (DT) and decision rules (DR). We evaluate our test validators through case studies in the domains of aerospace, networking and autonomous driving. We show that test validators generated using GP with Ochiai are significantly more accurate than those generated using GP with Tarantula and Naish or using DT or DR. Moreover, this accuracy advantage remains even when accounting for the flakiness of the simulator. We further show that our test validators generated by GP with Ochiai are robust against flakiness with only 4% average variation in their accuracy results across four different network and autonomous-driving systems with flaky behaviours. Finally, we show that, on average, 88.7% of the assertions inferred by our approach align or overlap with requirements precondition violations, ODD-limit violations, and nominal safe conditions extracted from technical standards and empirical results in the literature.
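For readers unfamiliar with the three ranking formulas named in this abstract, the sketch below gives their standard definitions from the fault-localization literature. The abstract does not say how the counts are derived from test inputs when the formulas serve as GP fitness functions, so the usual SBFL convention is assumed here: `ef` and `ep` are the failing and passing tests that cover an element, `nf` and `np_` the failing and passing tests that do not. The parameter names are hypothetical, and "Naish" is taken to mean the variant commonly called Naish2 (Op), which is also an assumption.

```python
import math


def ochiai(ef, ep, nf, np_):
    """Ochiai: ef / sqrt((ef + nf) * (ef + ep)); np_ is unused by this formula."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def tarantula(ef, ep, nf, np_):
    """Tarantula: failing-coverage ratio normalised by failing + passing ratios."""
    fail_ratio = ef / (ef + nf) if (ef + nf) else 0.0
    pass_ratio = ep / (ep + np_) if (ep + np_) else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0


def naish2(ef, ep, nf, np_):
    """Naish2 (Op): ef - ep / (ep + np_ + 1); nf is unused by this formula."""
    return ef - ep / (ep + np_ + 1)


# Toy counts: covered by all 4 failing tests and by 1 of 6 passing tests.
counts = dict(ef=4, ep=1, nf=0, np_=5)
scores = {f.__name__: f(**counts) for f in (ochiai, tarantula, naish2)}
```

All three scores grow as an element is covered by more failing and fewer passing tests; they differ in how strongly passing coverage is penalised, which is one plausible reason the choice of formula affects validator accuracy.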

CV | Jul 7, 2025
Effort-Optimized, Accuracy-Driven Labelling and Validation of Test Inputs for DL Systems: A Mixed-Integer Linear Programming Approach

Mohammad Hossein Amini, Mehrdad Sabetzadeh, Shiva Nejati

Software systems increasingly include AI components based on deep learning (DL). Reliable testing of such systems requires near-perfect test-input validity and label accuracy, with minimal human effort. Yet, the DL community has largely overlooked the need to build highly accurate datasets with minimal effort, since DL training is generally tolerant of labelling errors. This challenge, instead, reflects concerns more familiar to software engineering, where a central goal is to construct high-accuracy test inputs, with accuracy as close to 100% as possible, while keeping associated costs in check. In this article we introduce OPAL, a human-assisted labelling method that can be configured to target a desired accuracy level while minimizing the manual effort required for labelling. The main contribution of OPAL is a mixed-integer linear programming (MILP) formulation that minimizes labelling effort subject to a specified accuracy target. To evaluate OPAL we instantiate it for two tasks in the context of testing vision systems: automatic labelling of test inputs and automated validation of test inputs. Our evaluation, based on more than 2500 experiments performed on seven datasets, comparing OPAL with eight baseline methods, shows that OPAL, relying on its MILP formulation, achieves an average accuracy of 98.8%, while cutting manual labelling by more than half. OPAL significantly outperforms automated labelling baselines in labelling accuracy across all seven datasets, when all methods are provided with the same manual-labelling budget. For automated test-input validation, on average, OPAL reduces manual effort by 28.8% while achieving 4.5% higher accuracy than the SOTA test-input validation baselines. Finally, we show that augmenting OPAL with an active-learning loop leads to an additional 4.5% reduction in required manual labelling, without compromising accuracy.

CR | Jun 10, 2021
AI-enabled Automation for Completeness Checking of Privacy Policies

Orlando Amaral, Sallam Abualhaija, Damiano Torre et al.

Technological advances in information sharing have raised concerns about data protection. Privacy policies contain privacy-related requirements about how the personal data of individuals will be handled by an organization or a software system (e.g., a web service or an app). In Europe, privacy policies are subject to compliance with the General Data Protection Regulation (GDPR). A prerequisite for GDPR compliance checking is to verify whether the content of a privacy policy is complete according to the provisions of GDPR. Incomplete privacy policies might result in large fines on violating organizations as well as incomplete privacy-related software specifications. Manual completeness checking is both time-consuming and error-prone. In this paper, we propose AI-based automation for the completeness checking of privacy policies. Through systematic qualitative methods, we first build two artifacts to characterize the privacy-related provisions of GDPR, namely a conceptual model and a set of completeness criteria. Then, we develop an automated solution on top of these artifacts by leveraging a combination of natural language processing and supervised machine learning. Specifically, we identify the GDPR-relevant information content in privacy policies and subsequently check it against the completeness criteria. To evaluate our approach, we collected 234 real privacy policies from the fund industry. Over a set of 48 unseen privacy policies, our approach detected 300 of the total of 334 violations of some completeness criteria correctly, while producing 23 false positives. The approach thus has a precision of 92.9% and recall of 89.8%. Compared to a baseline that applies keyword search only, our approach results in an improvement of 24.5% in precision and 38% in recall.
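The precision and recall figures in this abstract follow directly from the reported counts. The short sketch below reproduces the arithmetic; the function name `precision_recall` is hypothetical, and only the three counts come from the abstract.

```python
def precision_recall(true_positives, false_positives, total_actual):
    """Precision = TP / (TP + FP); recall = TP / all actual violations."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / total_actual
    return precision, recall


# Counts from the abstract: 300 violations detected correctly out of 334,
# with 23 false positives.
precision, recall = precision_recall(300, 23, 334)
summary = f"precision={precision:.1%}, recall={recall:.1%}"
print(summary)  # precision=92.9%, recall=89.8%
```

The 34 missed violations (334 minus 300) are the false negatives that keep recall below precision.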

SE | Jul 23, 2020
Model Driven Engineering for Data Protection and Privacy: Application and Experience with GDPR

Damiano Torre, Mauricio Alferez, Ghanem Soltana et al.

In Europe and indeed worldwide, the General Data Protection Regulation (GDPR) provides protection to individuals regarding their personal data in the face of new technological developments. GDPR is widely viewed as the benchmark for data protection and privacy regulations that harmonizes data privacy laws across Europe. Although the GDPR is highly beneficial to individuals, it presents significant challenges for organizations monitoring or storing personal information. Since there is currently no automated solution with broad industrial applicability, organizations have no choice but to carry out expensive manual audits to ensure GDPR compliance. In this paper, we present a complete GDPR UML model as a first step towards designing automated methods for checking GDPR compliance. Given that the practical application of the GDPR is influenced by national laws of the EU Member States, we suggest a two-tiered description of the GDPR, generic and specialized. In this paper, we provide (1) the GDPR conceptual model we developed with complete traceability from its classes to the GDPR, (2) a glossary to help understand the model, (3) the plain-English description of 35 compliance rules derived from GDPR along with their encoding in OCL, and (4) the set of 20 variation points derived from GDPR to specialize the generic model. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research.

SE | May 4, 2020
On Systematically Building a Controlled Natural Language for Functional Requirements

Alvaro Veizaga, Mauricio Alferez, Damiano Torre et al.

[Context] Natural language (NL) is pervasive in software requirements specifications (SRSs). However, despite its popularity and widespread use, NL is highly prone to quality issues such as vagueness, ambiguity, and incompleteness. Controlled natural languages (CNLs) have been proposed as a way to prevent quality problems in requirements documents, while maintaining the flexibility to write and communicate requirements in an intuitive and universally understood manner. [Objective] In collaboration with an industrial partner from the financial domain, we systematically develop and evaluate a CNL, named Rimay, intended to help analysts write functional requirements. [Method] We rely on Grounded Theory for building Rimay and follow well-known guidelines for conducting and reporting industrial case study research. [Results] Our main contributions are: (1) a qualitative methodology to systematically define a CNL for functional requirements; this methodology is general and applicable to information systems beyond the financial domain, (2) a CNL grammar to represent functional requirements; this grammar is derived from our experience in the financial domain, but should be applicable, possibly with adaptations, to other information-system domains, and (3) an empirical evaluation of our CNL (Rimay) through an industrial case study. Our contributions draw on 15 representative SRSs, collectively containing 3215 NL requirements statements from the financial domain. [Conclusion] Our evaluation shows that Rimay is expressive enough to capture, on average, 88% (405 out of 460) of the NL requirements statements in four previously unseen SRSs from the financial domain.

SE | Jan 30, 2020
An Automated Framework for the Extraction of Semantic Legal Metadata from Legal Texts

Amin Sleimi, Nicolas Sannier, Mehrdad Sabetzadeh et al.

Semantic legal metadata provides information that helps with understanding and interpreting legal provisions. Such metadata is therefore important for the systematic analysis of legal requirements. However, manually enhancing a large legal corpus with semantic metadata is prohibitively expensive. Our work is motivated by two observations: (1) the existing requirements engineering (RE) literature does not provide a harmonized view on the semantic metadata types that are useful for legal requirements analysis; (2) automated support for the extraction of semantic legal metadata is scarce, and it does not exploit the full potential of artificial intelligence technologies, notably natural language processing (NLP) and machine learning (ML). Our objective is to take steps toward overcoming these limitations. To do so, we review and reconcile the semantic legal metadata types proposed in the RE literature. Subsequently, we devise an automated extraction approach for the identified metadata types using NLP and ML. We evaluate our approach through two case studies over the Luxembourgish legislation. Our results indicate a high accuracy in the generation of metadata annotations. In particular, in the two case studies, we were able to obtain precision scores of 97.2% and 82.4% and recall scores of 94.9% and 92.4%.

SE | May 29, 2019
Dynamic Adaptation of Software-defined Networks for IoT Systems: A Search-based Approach

Seung Yeob Shin, Shiva Nejati, Mehrdad Sabetzadeh et al.

The concept of Internet of Things (IoT) has led to the development of many complex and critical systems such as smart emergency management systems. IoT-enabled applications typically depend on a communication network for transmitting large volumes of data in unpredictable and changing environments. These networks are prone to congestion when there is a burst in demand, e.g., as an emergency situation is unfolding, and therefore rely on configurable software-defined networks (SDN). In this paper, we propose a dynamic adaptive SDN configuration approach for IoT systems. The approach enables resolving congestion in real time while minimizing network utilization, data transmission delays and adaptation costs. Our approach builds on existing work in dynamic adaptive search-based software engineering (SBSE) to reconfigure an SDN while simultaneously ensuring multiple quality of service criteria. We evaluate our approach on an industrial national emergency management system, which is aimed at detecting disasters and emergencies, and facilitating recovery and rescue operations by providing first responders with a reliable communication infrastructure. Our results indicate that (1) our approach is able to efficiently and effectively adapt an SDN to dynamically resolve congestion, and (2) compared to two baseline data forwarding algorithms that are static and non-adaptive, our approach increases data transmission rate by a factor of at least 3 and decreases data loss by at least 70%.

SE | Feb 1, 2019
Practical Constraint Solving for Generating System Test Data

Ghanem Soltana, Mehrdad Sabetzadeh, Lionel C. Briand

The ability to generate test data is often a necessary prerequisite for automated software testing. For the generated data to be fit for its intended purpose, the data usually has to satisfy various logical constraints. When testing is performed at a system level, these constraints tend to be complex and are typically captured in expressive formalisms based on first-order logic. Motivated by improving the feasibility and scalability of data generation for system testing, we present a novel approach, whereby we employ a combination of metaheuristic search and Satisfiability Modulo Theories (SMT) for constraint solving. Our approach delegates constraint solving tasks to metaheuristic search and SMT in such a way as to take advantage of the complementary strengths of the two techniques. We ground our work on test data models specified in UML, with OCL used as the constraint language. We present tool support and an evaluation of our approach over three industrial case studies. The results indicate that, for complex system test data generation problems, our approach presents substantial benefits over the state of the art in terms of applicability and scalability.