32.4SEMay 12Code
An Extensive Replication Study of the ABLoTS Approach for Bug LocalizationFeifei Niu, Enshuo Zhang, Christoph Mayr-Dorn et al.
Bug localization is the task of recommending source code locations (typically files) that contain the cause of a bug and hence need to be changed to fix the bug. Along these lines, information retrieval-based bug localization (IRBL) approaches have been adopted, which identify the most bug-prone files from the source code space. In current practice, a series of state-of-the-art IRBL techniques leverage the combination of different components (e.g., similar reports, version history, and code structure) to achieve better performance. ABLoTS is a recently proposed approach with the core component, TraceScore, that utilizes requirements and traceability information between different issue reports (i.e., feature requests and bug reports) to identify buggy source code snippets with promising results. To evaluate the accuracy of these results and obtain additional insights into the practical applicability of ABLoTS, we conducted a replication study of this approach with the original dataset and also on two extended datasets (i.e., additional Java dataset and Python dataset). The original dataset consists of 11 open source Java projects with 8,494 bug reports. The extended Java dataset includes 16 more projects comprising 25,893 bug reports and corresponding source code commits. The extended Python dataset consists of 12 projects with 1,289 bug reports. While we find that the TraceScore component, which is the core of ABLoTS, produces comparable or even better results with the extended datasets, we also find that we cannot reproduce the ABLoTS results, as reported in its original paper, due to an overlooked side effect of incorrectly choosing a cut-off date that led to test data leaking into training data with significant effects on performance.
SEApr 8, 2021Code
Do Communities in Developer Interaction Networks align with Subsystem Developer Teams? An Empirical Study of Open Source SystemsUsman Ashraf, Christoph Mayr-Dorn, Atif Mashkoor et al.
Studies over the past decade demonstrated that developers contributing to open source software systems tend to self-organize in "emerging" communities. This latent community structure has a significant impact on software quality. While several approaches address the analysis of developer interaction networks, the question of whether these emerging communities align with the developer teams working on various subsystems remains unanswered. Work on socio-technical congruence implies that people that work on the same task or artifact need to coordinate and thus communicate, potentially forming stronger interaction ties. Our empirical study of 10 open source projects revealed that developer communities change considerably across a project's lifetime (hence implying that relevant relations between developers change) and that their alignment with subsystem developer teams is mostly low. However, subsystems teams tend to remain more stable. These insights are useful for practitioners and researchers to better understand developer interaction structure of open source systems.
SEMar 3, 2021
TaskAllocator: A Recommendation Approach for Role-based Tasks Allocation in Agile Software DevelopmentSaad Shafiq, Atif Mashkoor, Christoph Mayr-Dorn et al.
In this paper, we propose a recommendation approach -- TaskAllocator -- in order to predict the assignment of incoming tasks to potential befitting roles. The proposed approach, identifying team roles rather than individual persons, allows project managers to perform better tasks allocation in case the individual developers are over-utilized or moved on to different roles/projects. We evaluated our approach on ten agile case study projects obtained from the Taiga.io repository. In order to determine the TaskAllocator's performance, we have conducted a benchmark study by comparing it with contemporary machine learning models. The applicability of the TaskAllocator was assessed through a plugin that can be integrated with JIRA and provides recommendations about suitable roles whenever a new task is added to the project. Lastly, the source code of the plugin and the dataset employed have been made public.
SEMay 27, 2020
Machine Learning for Software Engineering: A Systematic MappingSaad Shafiq, Atif Mashkoor, Christoph Mayr-Dorn et al.
Context: The software development industry is rapidly adopting machine learning for transitioning modern day software systems towards highly intelligent and self-learning systems. However, the full potential of machine learning for improving the software engineering life cycle itself is yet to be discovered, i.e., up to what extent machine learning can help reducing the effort/complexity of software engineering and improving the quality of resulting software systems. To date, no comprehensive study exists that explores the current state-of-the-art on the adoption of machine learning across software engineering life cycle stages. Objective: This article addresses the aforementioned problem and aims to present a state-of-the-art on the growing number of uses of machine learning in software engineering. Method: We conduct a systematic mapping study on applications of machine learning to software engineering following the standard guidelines and principles of empirical software engineering. Results: This study introduces a machine learning for software engineering (MLSE) taxonomy classifying the state-of-the-art machine learning techniques according to their applicability to various software engineering life cycle stages. Overall, 227 articles were rigorously selected and analyzed as a result of this study. Conclusion: From the selected articles, we explore a variety of aspects that should be helpful to academics and practitioners alike in understanding the potential of adopting machine learning techniques during software engineering projects.