Sebastiano Panichella

h-index: 3

5 papers

151 citations

Novelty: 29%

AI Score: 24

Ranked #172,757 of 194,257 authors (top 89%); #2,126 in SE (top 70%)

5 Papers

3.6 | SE | Apr 8, 2021 | Code

Do Communities in Developer Interaction Networks align with Subsystem Developer Teams? An Empirical Study of Open Source Systems

Usman Ashraf, Christoph Mayr-Dorn, Atif Mashkoor et al.

Studies over the past decade demonstrated that developers contributing to open source software systems tend to self-organize in "emerging" communities. This latent community structure has a significant impact on software quality. While several approaches address the analysis of developer interaction networks, the question of whether these emerging communities align with the developer teams working on various subsystems remains unanswered. Work on socio-technical congruence implies that people who work on the same task or artifact need to coordinate and thus communicate, potentially forming stronger interaction ties. Our empirical study of 10 open source projects revealed that developer communities change considerably across a project's lifetime (hence implying that relevant relations between developers change) and that their alignment with subsystem developer teams is mostly low. However, subsystem teams tend to remain more stable. These insights are useful for practitioners and researchers to better understand the developer interaction structure of open source systems.

6.4 | SE | Nov 8, 2021

Machine Learning-based Test Selection for Simulation-based Testing of Self-driving Cars Software

Sajad Khatiri, Christian Birchler, Bill Bosshard et al.

Simulation platforms facilitate the development of emerging cyber-physical systems (CPS) like self-driving cars (SDC) because they are more efficient and less dangerous than field operational tests. Despite this, thoroughly testing SDCs in simulated environments remains challenging because SDCs must be tested in a large number of long-running test scenarios. Past results on software testing optimization have shown that not all tests contribute equally to establishing confidence in test subjects' quality and reliability, with some "uninformative" tests that can be skipped (or removed) to reduce testing effort. However, this problem has been only partially addressed in the context of SDC simulation platforms. In this paper, we investigate test selection strategies to increase the cost-effectiveness of simulation-based testing in the context of SDCs. We propose an approach called SDC-Scissor (SDC coSt-effeCtIve teSt SelectOR), which leverages machine learning (ML) strategies to identify and skip tests that are unlikely to detect faults in SDCs before executing them. Specifically, SDC-Scissor extracts features concerning the characteristics of the test scenarios being executed in the simulation environment and, via ML strategies, predicts tests that lead to faults before executing them. Our evaluation shows that SDC-Scissor achieved high classification accuracy (up to 93.4%) in classifying tests leading to a fault, which allows improving testing cost-effectiveness: SDC-Scissor was able to reduce (ca. 170%) the time spent in running irrelevant tests and identified 33% more failure-triggering tests compared to a randomized baseline. Interestingly, SDC-Scissor does not introduce significant computational overhead in the SDC testing process, which is critical to SDC development in industrial settings.
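The selection idea in this abstract, predicting a test's outcome from features of the scenario and skipping tests predicted to pass, can be sketched roughly as follows. This is only an illustration under stated assumptions: the two features (road length and number of turns), the toy data, and the nearest-centroid classifier are all hypothetical stand-ins and are not the models or features evaluated in the paper.

```python
from math import dist

def centroid(rows):
    """Mean feature vector of a non-empty list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def train(features, labels):
    """Build one centroid per outcome label ("fail" or "pass")."""
    groups = {}
    for row, label in zip(features, labels):
        groups.setdefault(label, []).append(row)
    return {label: centroid(rows) for label, rows in groups.items()}

def predict(model, row):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: dist(model[label], row))

def select_tests(model, candidates):
    """Keep only the tests predicted to fail, i.e. likely to expose a fault."""
    return [name for name, row in candidates.items()
            if predict(model, row) == "fail"]

# Hypothetical features per test: (road length in metres, number of turns).
history = [(100.0, 1), (120.0, 2), (400.0, 9), (450.0, 11)]
outcomes = ["pass", "pass", "fail", "fail"]
model = train(history, outcomes)

# New scenarios are classified before any simulation is run.
candidates = {"t1": (110.0, 1), "t2": (430.0, 10), "t3": (90.0, 2)}
selected = select_tests(model, candidates)
print(selected)
```

The point of the sketch is the ordering of steps: features are computed from the scenario description alone, so the expensive simulation only runs for the tests that survive selection.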

13.3 | SE | Aug 17, 2021

What Do Developers Discuss about Code Comments?

Pooja Rani, Mathias Birrer, Sebastiano Panichella et al.

Code comments are important for program comprehension, development, and maintenance tasks. Given the varying standards for code comments, and their unstructured or semi-structured nature, developers (especially novices) are easily confused about which convention(s) to follow, or what tools to use while writing code documentation. Thus, they post related questions on external online sources to seek better commenting practices. In this paper, we analyze code comment discussions on online sources such as Stack Overflow (SO) and Quora to shed some light on the questions developers ask about commenting practices. We apply Latent Dirichlet Allocation (LDA) to identify emerging topics concerning code comments. Then we manually analyze a statistically significant sample set of posts to derive a taxonomy that provides an overview of the developer questions about commenting practices. Our results highlight that on SO nearly 40% of the questions mention how to write or process comments in documentation tools and environments, and nearly 20% of the questions are about potential limitations and possibilities of documentation tools to automatically and consistently add more information in comments. On the other hand, on Quora, developer questions focus more on background information (35% of the questions) or asking opinions (16% of the questions) about code comments. We found that (i) not all aspects of comments are covered in coding style guidelines, e.g., how to add a specific type of information, (ii) developers need support in learning the syntax and format conventions to add various types of information in comments, and (iii) developers are interested in various automated strategies for comments, such as detection of bad comments or automatic verification of comment style, but lack tool support to do that.

15.7 | SE | Jul 20, 2021

Single and Multi-objective Test Cases Prioritization for Self-driving Cars in Virtual Environments

Christian Birchler, Sajad Khatiri, Pouria Derakhshanfar et al.

Testing with simulation environments helps to identify critical failing scenarios for self-driving cars (SDCs). Simulation-based tests are safer than in-field operational tests and allow detecting software defects before deployment. However, these tests are very expensive and too numerous to be run frequently within limited time constraints. In this paper, we investigate test case prioritization techniques to detect SDC regression faults earlier with virtual tests. Our approach, called SDC-Prioritizer, prioritizes virtual tests for SDCs according to static features of the roads we designed to be used within the driving scenarios. These features can be collected without running the tests, which means that they do not require past execution results. We introduce two evolutionary approaches to prioritize the test cases using diversity metrics (black-box heuristics) computed on these static features. These two approaches, called SO-SDC-Prioritizer and MO-SDC-Prioritizer, use single-objective and multi-objective genetic algorithms, respectively, to find trade-offs between executing the less expensive tests and the most diverse test cases earlier. Our empirical study conducted in the SDC domain shows that MO-SDC-Prioritizer significantly improves the ability to detect safety-critical failures at the same level of execution time compared to baselines: random and greedy-based test case orderings. Besides, our study indicates that multi-objective meta-heuristics outperform single-objective approaches when prioritizing simulation-based tests for SDCs. MO-SDC-Prioritizer prioritizes test cases with a large improvement in fault detection while its overhead (up to 0.45% of the test execution cost) is negligible.
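The trade-off described here, running cheap and mutually diverse tests first, can be illustrated with a simple greedy ordering. Note that this is not the genetic-algorithm approach of the paper: it is a hypothetical greedy heuristic, and the feature vectors, cost values, and the diversity-per-cost score are assumptions made only for this sketch.

```python
from math import dist

def greedy_prioritize(features, costs):
    """Order tests by repeatedly picking the remaining test with the best
    diversity-per-cost ratio relative to the tests already chosen."""
    remaining = set(features)
    # Start with the cheapest test (ties broken by name for determinism).
    first = min(remaining, key=lambda t: (costs[t], t))
    order = [first]
    remaining.remove(first)
    while remaining:
        def score(t):
            # Diversity: distance to the closest already-selected test.
            diversity = min(dist(features[t], features[s]) for s in order)
            return (diversity / costs[t], t)
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

# Hypothetical static road features and estimated execution costs.
features = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (10.0, 0.0), "d": (10.0, 1.0)}
costs = {"a": 1.0, "b": 1.0, "c": 2.0, "d": 2.0}
order = greedy_prioritize(features, costs)
print(order)
```

Because both the features and the cost estimates are available without executing anything, an ordering like this can be computed before the first simulation starts.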

12.0 | SE | Jun 14, 2021 | Code

JUGE: An Infrastructure for Benchmarking Java Unit Test Generators

Xavier Devroey, Alessio Gambi, Juan Pablo Galeotti et al.

Researchers and practitioners have designed and implemented various automated test case generators to support effective software testing. Such generators exist for various languages (e.g., Java, C#, or Python) and for various platforms (e.g., desktop, web, or mobile applications). Such generators exhibit varying effectiveness and efficiency, depending on the testing goals they aim to satisfy (e.g., unit-testing of libraries vs. system-testing of entire applications) and the underlying techniques they implement. In this context, practitioners need to be able to compare different generators to identify the most suited one for their requirements, while researchers seek to identify future research directions. This can be achieved through the systematic execution of large-scale evaluations of different generators. However, the execution of such empirical evaluations is not trivial and requires a substantial effort to collect benchmarks, set up the evaluation infrastructure, and collect and analyse the results. In this paper, we present our JUnit Generation benchmarking infrastructure (JUGE) supporting generators (e.g., search-based, random-based, or symbolic execution) seeking to automate the production of unit tests for various purposes (e.g., validation, regression testing, or fault localization). The primary goal is to reduce the overall effort, ease the comparison of several generators, and enhance the knowledge transfer between academia and industry by standardizing the evaluation and comparison process. Since 2013, eight editions of a unit testing tool competition, co-located with the Search-Based Software Testing Workshop, have taken place and used and updated JUGE. As a result, an increasing number of tools (over ten) from both academia and industry have been evaluated on JUGE, matured over the years, and allowed the identification of future research directions.