15.9 SE Apr 22
Evaluating Software Defect Prediction Models via the Area Under the ROC Curve Can Be Misleading
Luigi Lavazza, Gabriele Rotoloni, Sandro Morasca
Background: Receiver Operating Characteristic (ROC) curves are widely used to evaluate the performance of Software Defect Prediction (SDP) models that estimate module fault-proneness, i.e., the probability that a module is faulty. A ROC curve maps a model's performance in terms of True Positive Rate and False Positive Rate for any possible threshold set on fault-proneness. The Area Under the ROC Curve (AUC) summarizes the performance of a model across all possible thresholds. Traditionally, ROC curves completely above the bisector of the ROC space are considered better than random, and high AUC values are associated with good performance. Aim: We investigate whether these beliefs are correct, and hence whether SDP model evaluation based on ROC curves and AUC is reliable. Method: We decorate ROC curves by highlighting the points corresponding to threshold values. We also represent True Positive Rate and False Positive Rate as functions of the threshold. Thus, we can evaluate whether a model classifies both faulty and non-faulty modules better than the random model. Results: We show that commonly used evaluation criteria may lead to wrong conclusions. Conclusions: A high value of AUC does not guarantee that both the True Positive Rate and the False Positive Rate of a model are better than the random model's for all possible thresholds. Either decorated ROC curves or alternative representations are needed to appreciate all the relevant aspects of SDP models.
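As a rough illustration of the quantities this abstract refers to, the following Python sketch tabulates True Positive Rate and False Positive Rate as functions of the threshold and computes AUC with the trapezoidal rule. It is not the authors' method or tooling: the function names `roc_points` and `auc_trapezoid`, the toy scores and labels, and the convention that a module is predicted faulty when its fault-proneness is greater than or equal to the threshold are all assumptions made for this example.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int],
               thresholds: List[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, FPR, TPR) for each threshold.

    Assumption: a module is predicted faulty when score >= threshold.
    labels use 1 for faulty and 0 for non-faulty modules.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        points.append((t, fp / neg, tp / pos))
    return points


def auc_trapezoid(points: List[Tuple[float, float, float]]) -> float:
    """Area under the ROC curve through the given points (trapezoidal rule)."""
    # Add the trivial end points and order along the curve; a ROC curve is
    # monotone, so sorting by (FPR, TPR) follows it from (0, 0) to (1, 1).
    xy = sorted({(fpr, tpr) for _, fpr, tpr in points} | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(xy, xy[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up fault-proneness estimates and true labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels, sorted(set(scores), reverse=True))
auc = auc_trapezoid(points)

# Listing the threshold next to each point mirrors the idea of "decorating"
# the curve: the same AUC can hide very different TPR/FPR values at the
# threshold one actually intends to use.
for t, fpr, tpr in points:
    print(f"threshold={t:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
print(f"AUC = {auc:.4f}")
```

On this toy data the script reports an AUC of 0.8125. The per-threshold table is the part that a single AUC value discards, which is why inspecting TPR and FPR at individual thresholds can lead to a different judgment than AUC alone.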
SE Mar 16, 2021
Understanding and Modeling AI-Intensive System Development
Luigi Lavazza, Sandro Morasca
Developers of AI-Intensive Systems--i.e., systems that involve both "traditional" software and Artificial Intelligence--are recognizing the need to organize development systematically and use engineered methods and tools. Since an AI-Intensive System (AIIS) relies heavily on software, it is expected that Software Engineering (SE) methods and tools can help. However, AIIS development differs from the development of "traditional" software systems in a few substantial aspects. Hence, traditional SE methods and tools are not suitable or sufficient by themselves and need to be adapted and extended. A quest for "SE for AI" methods and tools has started. We believe that, in this effort, we should learn from experience and avoid repeating some of the mistakes made in the quest for SE in past years. To this end, a fundamental instrument is a set of concepts and a notation to deal with AIIS and the problems that characterize their development processes. In this paper, we propose to describe AIIS via a notation that was proposed for SE and embeds a set of concepts that are suitable to represent AIIS as well. We demonstrate the usage of the notation by modeling some characteristics that are particularly relevant for AIIS.
SE Jan 7, 2021
Toward Inclusion of Children as Software Engineering Stakeholders
Letizia Jaccheri, Sandro Morasca
Background: A growing amount of software is available to children today. Children use both software that has been explicitly developed for them and software for general users. While they obtain clear benefits from software, such as access to creativity tools and learning resources, children are also exposed to several risks and disadvantages, such as privacy violation, inactivity, or safety risks that can even lead to death. The research and development community is addressing and investigating positive and negative impacts of software for children one by one, but no comprehensive model exists that relates software engineering and children as stakeholders in their own right. Aims: The final objective of this line of research is to propose effective ways in which children can be involved in Software Engineering activities as stakeholders. Specifically, in this paper, we investigate the quality aspects that are of interest for children, as quality is a crucial aspect in the development of any kind of software, especially for stakeholders like children. Method: Our contribution is based mainly on an analysis of studies at the intersection between Software Engineering (especially software quality) and Child Computer Interaction. Results: We identify a set of qualities and a preliminary set of guidelines that can be used by researchers and practitioners in understanding the complex interrelations between Software Engineering and children. Based on the qualities and the guidelines, researchers can design empirical investigations to obtain deeper insights into the phenomenon and propose new Software Engineering knowledge specific to this type of stakeholder. Conclusions: This conceptualization is a first step towards a framework to support children as stakeholders in software engineering.