SEApr 20Code
When AI Models Become Dependencies: Studying the Evolution of Pre-Trained Model Reuse in Downstream Software SystemsPeerachai Banyongrakkul, Mansooreh Zahedi, Christoph Treude et al.
Modern software systems have transitioned from purely code-based architectures to AI-integrated systems where pre-trained models (PTMs) serve as permanent dependencies. However, while the evolution of traditional software libraries is well-documented, we lack a clear understanding of how these "PTM dependencies" change over time. Unlike libraries, PTMs are characterized by opaque internals and less standardized, rapidly evolving release cycles. Furthermore, their multi-role nature enables developers to treat individual instances of a single PTM as separate functional dependencies based on their specific downstream tasks. This raises a critical question for software maintenance: do PTMs change like standard software libraries or do they follow a divergent pattern? To answer this, we present the first empirical study of downstream PTM changes, analyzing a comprehensive dataset of 4,988 releases across 323 GitHub OSS repositories that reuse open-source PTMs. Using traditional software libraries as a baseline, we find that PTMs follow a qualitatively distinct pattern. PTMs are typically added late in the project life-cycle and tend to accumulate rather than be replaced as a project matures. Our findings show that PTM changes are three times less frequent (406 of 2,814 release transitions) than library changes. PTM changes are also less routinely documented, but more likely to carry explicit rationale. Unlike libraries, which evolve reactively, PTM evolution is proactively driven by capability expansion, with a unique documented rationale of PTM testing uncertainty. Our work calls for a rethinking of how PTMs are tracked and managed as dependencies in modern software engineering.
SEApr 30
AI Failures in the Eyes of the Downstream Developer: A First Look at Concerns, Practices, and ChallengesHaoyu Gao, Mansooreh Zahedi, Wenxin Jiang et al.
With the advancement of AI models, more software systems are adopting AI as a component to facilitate automation. Pre-trained models (PTMs) have become a cornerstone of AI-based software, allowing for rapid integration and development with lower training cost. However, their adoption also introduces failure modes such as data leakage and biased outputs, that may require careful handling by downstream developers. While previous research has proposed taxonomies of these technical concerns and various mitigation strategies, how downstream developers address these issues during the development of general AI-based software when reusing PTMs remains unexplored. Understanding downstream developers' perspectives is essential because they directly influence how these potential failures concerns translate into practice, such as determining whether immediate risks like data leakage or model bias are recognised, mitigated, or inadvertently overlooked in real-world deployments. This study investigates downstream developers' concerns, practices and perceived challenges regarding practical AI failures during the development of AI-based software. To achieve this, we conducted a mixed-method study, including interviews with 16 participants, a survey of 86 practitioners,
SEJul 16, 2025
From Release to Adoption: Challenges in Reusing Pre-trained AI Models for Downstream DevelopersPeerachai Banyongrakkul, Mansooreh Zahedi, Patanamon Thongtanunam et al.
Pre-trained models (PTMs) have gained widespread popularity and achieved remarkable success across various fields, driven by their groundbreaking performance and easy accessibility through hosting providers. However, the challenges faced by downstream developers in reusing PTMs in software systems are less explored. To bridge this knowledge gap, we qualitatively created and analyzed a dataset of 840 PTM-related issue reports from 31 OSS GitHub projects. We systematically developed a comprehensive taxonomy of PTM-related challenges that developers face in downstream projects. Our study identifies seven key categories of challenges that downstream developers face in reusing PTMs, such as model usage, model performance, and output quality. We also compared our findings with existing taxonomies. Additionally, we conducted a resolution time analysis and, based on statistical tests, found that PTM-related issues take significantly longer to be resolved than issues unrelated to PTMs, with significant variation across challenge categories. We discuss the implications of our findings for practitioners and possibilities for future research.
SEApr 6, 2023
SoK: Machine Learning for Continuous IntegrationAli Kazemi Arani, Mansooreh Zahedi, Triet Huynh Minh Le et al.
Continuous Integration (CI) has become a well-established software development practice for automatically and continuously integrating code changes during software development. An increasing number of Machine Learning (ML) based approaches for automation of CI phases are being reported in the literature. It is timely and relevant to provide a Systemization of Knowledge (SoK) of ML-based approaches for CI phases. This paper reports an SoK of different aspects of the use of ML for CI. Our systematic analysis also highlights the deficiencies of the existing ML-based solutions that can be improved for advancing the state-of-the-art.
SEJun 28, 2024
Systematic Literature Review on Application of Learning-based Approaches in Continuous IntegrationAli Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi et al.
Context: Machine learning (ML) and deep learning (DL) analyze raw data to extract valuable insights in specific phases. The rise of continuous practices in software projects emphasizes automating Continuous Integration (CI) with these learning-based methods, while the growing adoption of such approaches underscores the need for systematizing knowledge. Objective: Our objective is to comprehensively review and analyze existing literature concerning learning-based methods within the CI domain. We endeavour to identify and analyse various techniques documented in the literature, emphasizing the fundamental attributes of training phases within learning-based solutions in the context of CI. Method: We conducted a Systematic Literature Review (SLR) involving 52 primary studies. Through statistical and thematic analyses, we explored the correlations between CI tasks and the training phases of learning-based methodologies across the selected studies, encompassing a spectrum from data engineering techniques to evaluation metrics. Results: This paper presents an analysis of the automation of CI tasks utilizing learning-based methods. We identify and analyze nine types of data sources, four steps in data preparation, four feature types, nine subsets of data features, five approaches for hyperparameter selection and tuning, and fifteen evaluation metrics. Furthermore, we discuss the latest techniques employed, existing gaps in CI task automation, and the characteristics of the utilized learning-based techniques. Conclusion: This study provides a comprehensive overview of learning-based methods in CI, offering valuable insights for researchers and practitioners developing CI task automation. It also highlights the need for further research to advance these methods in CI.
SEJan 16, 2024
Fairness Concerns in App Reviews: A Study on AI-based Mobile AppsAli Rezaei Nasab, Maedeh Dashti, Mojtaba Shahin et al.
Fairness is one of the socio-technical concerns that must be addressed in software systems. Considering the popularity of mobile software applications (apps) among a wide range of individuals worldwide, mobile apps with unfair behaviors and outcomes can affect a significant proportion of the global population, potentially more than any other type of software system. Users express a wide range of socio-technical concerns in mobile app reviews. This research aims to investigate fairness concerns raised in mobile app reviews. Our research focuses on AI-based mobile app reviews as the chance of unfair behaviors and outcomes in AI-based mobile apps may be higher than in non-AI-based apps. To this end, we first manually constructed a ground-truth dataset, including 1,132 fairness and 1,473 non-fairness reviews. Leveraging the ground-truth dataset, we developed and evaluated a set of machine learning and deep learning models that distinguish fairness reviews from non-fairness reviews. Our experiments show that our best-performing model can detect fairness reviews with a precision of 94%. We then applied the best-performing model on approximately 9.5M reviews collected from 108 AI-based apps and identified around 92K fairness reviews. Next, applying the K-means clustering technique to the 92K fairness reviews, followed by manual analysis, led to the identification of six distinct types of fairness concerns (e.g., 'receiving different quality of features and services in different platforms and devices' and 'lack of transparency and fairness in dealing with user-generated content'). Finally, the manual analysis of 2,248 app owners' responses to the fairness reviews identified six root causes (e.g., 'copyright issues') that app owners report to justify fairness concerns.
SEMay 22, 2023
Systematic Literature Review on Application of Machine Learning in Continuous IntegrationAli Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi et al.
This research conducted a systematic review of the literature on machine learning (ML)-based methods in the context of Continuous Integration (CI) over the past 22 years. The study aimed to identify and describe the techniques used in ML-based solutions for CI and analyzed various aspects such as data engineering, feature engineering, hyper-parameter tuning, ML models, evaluation methods, and metrics. In this paper, we have depicted the phases of CI testing, the connection between them, and the employed techniques in training the ML method phases. We presented nine types of data sources and four taken steps in the selected studies for preparing the data. Also, we identified four feature types and nine subsets of data features through thematic analysis of the selected studies. Besides, five methods for selecting and tuning the hyper-parameters are shown. In addition, we summarised the evaluation methods used in the literature and identified fifteen different metrics. The most commonly used evaluation methods were found to be precision, recall, and F1-score, and we have also identified five methods for evaluating the performance of trained ML models. Finally, we have presented the relationship between ML model types, performance measurements, and CI phases. The study provides valuable insights for researchers and practitioners interested in ML-based methods in CI and emphasizes the need for further research in this area.
SEFeb 18, 2022
Why, How and Where of Delays in Software Security Patch Management: An Empirical Investigation in the Healthcare SectorNesara Dissanayake, Mansooreh Zahedi, Asangi Jayatilaka et al.
Numerous security attacks that resulted in devastating consequences can be traced back to a delay in applying a security patch. Despite the criticality of timely patch application, not much is known about why and how delays occur when applying security patches in practice, and how the delays can be mitigated. Based on longitudinal data collected from 132 delayed patching tasks over a period of four years and observations of patch meetings involving eight teams from two organisations in the healthcare domain, and using quantitative and qualitative data analysis approaches, we identify a set of reasons relating to technology, people and organisation as key explanations that cause delays in patching. Our findings also reveal that the most prominent cause of delays is attributable to coordination delays in the patch management process and a majority of delays occur during the patch deployment phase. Towards mitigating the delays, we describe a set of strategies employed by the studied practitioners. This research serves as the first step towards understanding the practical reasons for delays and possible mitigation strategies in vulnerability patch management. Our findings provide useful insights for practitioners to understand what and where improvement is needed in the patch management process and guide them towards taking timely actions against potential attacks. Also, our findings help researchers to invest effort into designing and developing computer-supported tools to better support a timely security patch management process.
CRDec 20, 2021
Systematic Literature Review on Cyber Situational Awareness VisualizationsLiuyue Jiang, Asangi Jayatilaka, Mehwish Nasim et al.
The dynamics of cyber threats are increasingly complex, making it more challenging than ever for organizations to obtain in-depth insights into their cyber security status. Therefore, organizations rely on Cyber Situational Awareness (CSA) to support them in better understanding the threats and associated impacts of cyber events. Due to the heterogeneity and complexity of cyber security data, often with multidimensional attributes, sophisticated visualization techniques are needed to achieve CSA. However, there have been no previous attempts to systematically review and analyze the scientific literature on CSA visualizations. In this paper, we systematically select and review 54 publications that discuss visualizations to support CSA. We extract data from these papers to identify key stakeholders, information types, data sources, and visualization techniques. Furthermore, we analyze the level of CSA supported by the visualizations, alongside examining the maturity of the visualizations, challenges, and practices related to CSA visualizations to prepare a full analysis of the current state of CSA in an organizational context. Our results reveal certain gaps in CSA visualizations. For instance, the largest focus is on operational-level staff, and there is a clear lack of visualizations targeting other types of stakeholders such as managers, higher-level decision makers, and non-expert users. Most papers focus on threat information visualization, and there is a dearth of papers that visualize impact information, response plans, and information shared within teams. Based on the results that highlight the important concerns in CSA visualizations, we recommend a list of future research directions.
CRDec 12, 2021
Evaluation of Security Training and Awareness Programs: Review of Current Practices and GuidelineAsangi Jayatilaka, Nathan Beu, Irina Baetu et al.
Evaluating the effectiveness of security awareness and training programs is critical for minimizing organizations' human security risk. Based on a literature review and industry interviews, we discuss current practices and devise guidelines for measuring the effectiveness of security training and awareness initiatives used by organizations
SEJul 29, 2021
An Empirical Study of Developers' Discussions about Security Challenges of Different Programming LanguagesRoland Croft, Yongzheng Xie, Mansooreh Zahedi et al.
Given programming languages can provide different types and levels of security support, it is critically important to consider security aspects while selecting programming languages for developing software systems. Inadequate consideration of security in the choice of a programming language may lead to potential ramifications for secure development. Whilst theoretical analysis of the supposed security properties of different programming languages has been conducted, there has been relatively little effort to empirically explore the actual security challenges experienced by developers. We have performed a large-scale study of the security challenges of 15 programming languages by quantitatively and qualitatively analysing the developers' discussions from Stack Overflow and GitHub. By leveraging topic modelling, we have derived a taxonomy of 18 major security challenges for 6 topic categories. We have also conducted comparative analysis to understand how the identified challenges vary regarding the different programming languages and data sources. Our findings suggest that the challenges and their characteristics differ substantially for different programming languages and data sources, i.e., Stack Overflow and GitHub. The findings provide evidence-based insights and understanding of security challenges related to different programming languages to software professionals (i.e., practitioners or researchers). The reported taxonomy of security challenges can assist both practitioners and researchers in better understanding and traversing the secure development landscape. This study highlights the importance of the choice of technology, e.g., programming language, in secure software engineering. Hence, the findings are expected to motivate practitioners to consider the potential impact of the choice of programming languages on software security.
CRJul 5, 2021
An Empirical Analysis of Practitioners' Perspectives on Security Tool Integration into DevOpsRoshan Namal Rajapakse, Mansooreh Zahedi, Muhammad Ali Babar
Background: Security tools play a vital role in enabling developers to build secure software. However, it can be quite challenging to introduce and fully leverage security tools without affecting the speed or frequency of deployments in the DevOps paradigm. Aims: We aim to empirically investigate the key challenges practitioners face when integrating security tools into a DevOps workflow in order to provide recommendations to overcome them. Method: We conducted a study involving 31 systematically selected webinars on integrating security tools in DevOps. We used a qualitative data analysis method, i.e., thematic analysis, to identify the challenges and emerging solutions related to integrating security tools in rapid deployment environments. Results: We find that while traditional security tools are unable to cater for the needs of DevOps, the industry is moving towards new generations of tools that have started focusing on these requirements. We have developed a DevOps workflow that integrates security tools and a set of guidelines by synthesizing practitioners' recommendations in the analyzed webinars. Conclusion: While the latest security tools are addressing some of the requirements of DevOps, there are many tool-related drawbacks yet to be adequately addressed.
SEJun 7, 2021
A Grounded Theory of the Role of Coordination in Software Security Patch ManagementNesara Dissanayake, Mansooreh Zahedi, Asangi Jayatilaka et al.
Several disastrous security attacks can be attributed to delays in patching software vulnerabilities. While researchers and practitioners have paid significant attention to automate vulnerabilities identification and patch development activities of software security patch management, there has been relatively little effort dedicated to gain an in-depth understanding of the socio-technical aspects, e.g., coordination of interdependent activities of the patching process and patching decisions, that may cause delays in applying security patches. We report on a Grounded Theory study of the role of coordination in security patch management. The reported theory consists of four inter-related dimensions, i.e., causes, breakdowns, constraints, and mechanisms. The theory explains the causes that define the need for coordination among interdependent software and hardware components and multiple stakeholders' decisions, the constraints that can negatively impact coordination, the breakdowns in coordination, and the potential corrective measures. This study provides potentially useful insights for researchers and practitioners who can carefully consider the needs of and devise suitable solutions for supporting the coordination of interdependencies involved in security patch management.
SEMar 15, 2021
Challenges and solutions when adopting DevSecOps: A systematic reviewRoshan N. Rajapakse, Mansooreh Zahedi, M. Ali Babar et al.
Context: DevOps has become one of the fastest-growing software development paradigms in the industry. However, this trend has presented the challenge of ensuring secure software delivery while maintaining the agility of DevOps. The efforts to integrate security in DevOps have resulted in the DevSecOps paradigm, which is gaining significant interest from both industry and academia. However, the adoption of DevSecOps in practice is proving to be a challenge. Objective: This study aims to systemize the knowledge about the challenges faced by practitioners when adopting DevSecOps and the proposed solutions reported in the literature. We also aim to identify the areas that need further research in the future. Method: We conducted a Systematic Literature Review of 54 peer-reviewed studies. The thematic analysis method was applied to analyze the extracted data. Results: We identified 21 challenges related to adopting DevSecOps, 31 specific solutions, and the mapping between these findings. We also determined key gap areas in this domain by holistically evaluating the available solutions against the challenges. The results of the study were classified into four themes: People, Practices, Tools, and Infrastructure. Our findings demonstrate that tool-related challenges and solutions were the most frequently reported, driven by the need for automation in this paradigm. Shift-left security and continuous security assessment were two key practices recommended for DevSecOps. Conclusions: We highlight the need for developer-centered application security testing tools that target the continuous practices in DevSecOps. More research is needed on how the traditionally manual security practices can be automated to suit rapid software deployment cycles. Finally, achieving a suitable balance between the speed of delivery and security is a significant issue practitioners face in the DevSecOps paradigm.
CRJan 25, 2021
End-Users' Knowledge and Perception about Security of Mobile Health Apps: A Case Study with Two Saudi Arabian mHealth ProvidersBakheet Aljedaani, Aakash Ahmad, Mansooreh Zahedi et al.
Mobile health applications (mHealth apps for short) are being increasingly adopted in the healthcare sector, enabling stakeholders such as governments, health units, medics, and patients, to utilize health services in a pervasive manner. Despite having several known benefits, mHealth apps entail significant security and privacy challenges that can lead to data breaches with serious social, legal, and financial consequences. This research presents an empirical investigation about security awareness of end-users of mHealth apps that are available on major mobile platforms, including Android and iOS. We collaborated with two mHealth providers in Saudi Arabia to survey 101 end-users, investigating their security awareness about (i) existing and desired security features, (ii) security related issues, and (iii) methods to improve security knowledge. Findings indicate that majority of the end-users are aware of the existing security features provided by the apps (e.g., restricted app permissions); however, they desire usable security (e.g., biometric authentication) and are concerned about privacy of their health information (e.g., data anonymization). End-users suggested that protocols such as session timeout or Two-factor authentication (2FA) positively impact security but compromise usability of the app. Security-awareness via social media, peer guidance, or training from app providers can increase end-users trust in mHealth apps. This research investigates human-centric knowledge based on empirical evidence and provides a set of guidelines to develop secure and usable mHealth apps.
SEDec 1, 2020
Software Security Patch Management -- A Systematic Literature Review of Challenges, Approaches, Tools and PracticesNesara Dissanayake, Asangi Jayatilaka, Mansooreh Zahedi et al.
Context: Software security patch management purports to support the process of patching known software security vulnerabilities. Given the increasing recognition of the importance of software security patch management, it is important and timely to systematically review and synthesise the relevant literature on this topic. Objective: This paper aims at systematically reviewing the state of the art of software security patch management to identify the socio-technical challenges in this regard, reported solutions (i.e., approaches, tools, and practices), the rigour of the evaluation and the industrial relevance of the reported solutions, and to identify the gaps for future research. Method: We conducted a systematic literature review of 72 studies published from 2002 to March 2020, with extended coverage until September 2020 through forward snowballing. Results: We identify 14 socio-technical challenges, 18 solution approaches, tools and practices mapped onto the software security patch management process. We provide a mapping between the solutions and challenges to enable a reader to obtain a holistic overview of the gap areas. The findings also reveal that only 20.8% of the reported solutions have been rigorously evaluated in industrial settings. Conclusion: Our results reveal that 50% of the common challenges have not been directly addressed in the solutions and that most of them (38.9%) address the challenges in one phase of the process, namely vulnerability scanning, assessment and prioritisation. Based on the results that highlight the important concerns in software security patch management and the lack of solutions, we recommend a list of future research directions. This study also provides useful insights about different opportunities for practitioners to adopt new solutions and understand the variations of their practical utility.
SEAug 29, 2020
Security Awareness of End-Users of Mobile Health Applications: An Empirical StudyBakheet Aljedaani, Aakash Ahmad, Mansooreh Zahedi et al.
Mobile systems offer portable and interactive computing, empowering users, to exploit a multitude of context-sensitive services, including mobile healthcare. Mobile health applications (i.e., mHealth apps) are revolutionizing the healthcare sector by enabling stakeholders to produce and consume healthcare services. A widespread adoption of mHealth technologies and rapid increase in mHealth apps entail a critical challenge, i.e., lack of security awareness by end-users regarding health-critical data. This paper presents an empirical study aimed at exploring the security awareness of end-users of mHealth apps. We collaborated with two mHealth providers in Saudi Arabia to gather data from 101 end-users. The results reveal that despite having the required knowledge, end-users lack appropriate behaviour , i.e., reluctance or lack of understanding to adopt security practices, compromising health-critical data with social, legal, and financial consequences. The results emphasize that mHealth providers should ensure security training of end-users (e.g., threat analysis workshops), promote best practices to enforce security (e.g., multi-step authentication), and adopt suitable mHealth apps (e.g., trade-offs for security vs usability). The study provides empirical evidence and a set of guidelines about security awareness of mHealth apps.
SEAug 7, 2020
An Empirical Study on Developing Secure Mobile Health Apps: The Developers PerspectiveBakheet Aljedaani, Aakash Ahmad, Mansooreh Zahedi et al.
Mobile apps exploit embedded sensors and wireless connectivity of a device to empower users with portable computations, context-aware communication, and enhanced interaction. Specifically, mobile health apps (mHealth apps for short) are becoming integral part of mobile and pervasive computing to improve the availability and quality of healthcare services. Despite the offered benefits, mHealth apps face a critical challenge, i.e., security of health critical data that is produced and consumed by the app. Several studies have revealed that security specific issues of mHealth apps have not been adequately addressed. The objectives of this study are to empirically (a) investigate the challenges that hinder development of secure mHealth apps, (b) identify practices to develop secure apps, and (c) explore motivating factors that influence secure development. We conducted this study by collecting responses of 97 developers from 25 countries, across 06 continents, working in diverse teams and roles to develop mHealth apps for Android, iOS, and Windows platform. Qualitative analysis of the survey data is based on (i) 8 critical challenges, (ii) taxonomy of best practices to ensure security, and (iii) 6 motivating factors that impact secure mHealth apps. This research provides empirical evidence as practitioners view and guidelines to develop emerging and next generation of secure mHealth apps.
SEAug 27, 2018
An Empirical Study of Architecting for Continuous Delivery and DeploymentMojtaba Shahin, Mansooreh Zahedi, Muhammad Ali Babar et al.
Recently, many software organizations have been adopting Continuous Delivery and Continuous Deployment (CD) practices to develop and deliver quality software more frequently and reliably. Whilst an increasing amount of the literature covers different aspects of CD, little is known about the role of software architecture in CD and how an application should be (re-) architected to enable and support CD. We have conducted a mixed-methods empirical study that collected data through in-depth, semi-structured interviews with 21 industrial practitioners from 19 organizations, and a survey of 91 professional software practitioners. Based on a systematic and rigorous analysis of the gathered qualitative and quantitative data, we present a conceptual framework to support the process of (re-) architecting for CD. We provide evidence-based insights about practicing CD within monolithic systems and characterize the principle of "small and independent deployment units" as an alternative to the monoliths. Our framework supplements the architecting process in a CD context through introducing the quality attributes (e.g., resilience) that require more attention and demonstrating the strategies (e.g., prioritizing operations concerns) to design operations-friendly architectures. We discuss the key insights (e.g., monoliths and CD are not intrinsically oxymoronic) gained from our study and draw implications for research and practice.
SEMar 13, 2017
Security Support in Continuous Deployment PipelineFaheem Ullah, Adam Johannes Raft, Mojtaba Shahin et al.
Continuous Deployment (CD) has emerged as a new practice in the software industry to continuously and automatically deploy software changes into production. Continuous Deployment Pipeline (CDP) supports CD practice by transferring the changes from the repository to production. Since most of the CDP components run in an environment that has several interfaces to the Internet, these components are vulnerable to various kinds of malicious attacks. This paper reports our work aimed at designing secure CDP by utilizing security tactics. We have demonstrated the effectiveness of five security tactics in designing a secure pipeline by conducting an experiment on two CDPs - one incorporates security tactics while the other does not. Both CDPs have been analyzed qualitatively and quantitatively. We used assurance cases with goal-structured notations for qualitative analysis. For quantitative analysis, we used penetration tools. Our findings indicate that the applied tactics improve the security of the major components (i.e., repository, continuous integration server, main server) of a CDP by controlling access to the components and establishing secure connections.