SEMay 15Code
Psychological Safety Framework in Pull-based Open Source ProjectsEmeralda Sesari, Federica Sarro, Ayushi Rastogi
Psychological safety refers to the belief that team members can speak up or make mistakes without fear of negative consequences. While it is recognized as important in traditional software teams, its role in open-source software development remains understudied. Open-source contributors often collaborate without formal roles or structures, where interpersonal relationships can significantly influence participation. Code review, a central and collaborative activity in modern software development, offers a valuable context for observing such team interactions. This paper introduces a framework grounded in psychological safety theory to identify behaviors that signal PS in pull request interactions. We operationalize these behaviors using 10 observable variables derived from 60,684 PRs across 26 popular GitHub repositories and construct a PS index at repository level. We then empirically test the relationship between this index and contributors' short-term (within 1 year) and long-term (over 4-5 years) sustained participation using three logistic regression models. Contributors are more likely to remain active in repositories with higher levels of psychological safety. Psychological safety is positively associated with both short-term and long-term sustained participation. However, prior participation emerges as a stronger predictor of future engagement, reducing the effect of psychological safety when accounted for. This study introduces a a theory-informed framework for measuring psychological safety through pull request data and provides empirical evidence of its relevance in sustaining participation within open-source development.
SEMay 28, 2021Code
Pull Request Decision Explained: An Empirical OverviewXunhui Zhang, Yue Yu, Georgios Gousios et al.
Context: Pull-based development model is widely used in open source, leading the trends in distributed software development. One aspect which has garnered significant attention is studies on pull request decision - identifying factors for explanation. Objective: This study builds on a decade long research on pull request decision to explain it. We empirically investigate how factors influence pull request decision and scenarios that change the influence of factors. Method: We identify factors influencing pull request decision on GitHub through a systematic literature review and infer it by mining archival data. We collect a total of 3,347,937 pull requests with 95 features from 11,230 diverse projects on GitHub. Using this data, we explore the relations of the factors to each other and build mixed-effect logistic regression models to empirically explain pull request decision. Results: Our study shows that a small number of factors explain pull request decision with the integrator same or different from the submitter as the most important factor. We also noted that some factors are important only in special cases e.g., the percentage of failed builds is important for pull request decision when continuous integration is used.
CYOct 2, 2020Code
Including Everyone, Everywhere: Understanding Opportunities and Challenges of Geographic Gender-Inclusion in OSSGede Artha Azriadi Prana, Denae Ford, Ayushi Rastogi et al.
The gender gap is a significant concern facing the software industry as the development becomes more geographically distributed. Widely shared reports indicate that gender differences may be specific to each region. However, how complete can these reports be with little to no research reflective of the Open Source Software (OSS) process and communities software is now commonly developed in? Our study presents a multi-region geographical analysis of gender inclusion on GitHub. This mixed-methods approach includes quantitatively investigating differences in gender inclusion in projects across geographic regions and investigate these trends over time using data from contributions to 21,456 project repositories. We also qualitatively understand the unique experiences of developers contributing to these projects through a survey that is strategically targeted to developers in various regions worldwide. Our findings indicate that gender diversity is low across all parts of the world, with no substantial difference across regions. However, there has been statistically significant improvement in diversity worldwide since 2014, with certain regions such as Africa improving at faster pace. We also find that most motivations and barriers to contributions (e.g., lack of resources to contribute and poor working environment) were shared across regions, however, some insightful differences, such as how to make projects more inclusive, did arise. From these findings, we derive and present implications for tools that can foster inclusion in open source software communities and empower contributions from everyone, everywhere.
SEMar 24
Machine Learning Models for the Early Detection of Burnout in Software Engineering: a Systematic Literature ReviewTien Rahayu Tulili, Ayushi Rastogi, Andrea Capiluppi
Burnout is an occupational syndrome that, like many other professions, affects the majority of software engineers. Past research studies showed important trends, including an increasing use of machine learning techniques to allow for an early detection of burnout. This paper is a systematic literature review (SLR) of the research papers that proposed machine learning (ML) approaches, and focused on detecting burnout in software developers and IT professionals. Our objective is to review the accuracy and precision of the proposed ML techniques, and to formulate recommendations for future researchers interested to replicate or extend those studies. From our SLR we observed that a majority of primary studies focuses on detecting emotions or utilise emotional dimensions to detect or predict the presence of burnout. We also performed a cross-sectional study to detect which ML approach shows a better performance at detecting emotions; and which dataset has more potential and expressivity to capture emotions. We believe that, by identifying which ML tools and datasets show a better performance at detecting emotions, and indirectly at identifying burnout, our paper can be a valuable asset to progress in this important research direction.
CLOct 29, 2024
Is Our Chatbot Telling Lies? Assessing Correctness of an LLM-based Dutch Support ChatbotHerman Lassche, Michiel Overeem, Ayushi Rastogi
Companies support their customers using live chats and chatbots to gain their loyalty. AFAS is a Dutch company aiming to leverage the opportunity large language models (LLMs) offer to answer customer queries with minimal to no input from its customer support team. Adding to its complexity, it is unclear what makes a response correct, and that too in Dutch. Further, with minimal data available for training, the challenge is to identify whether an answer generated by a large language model is correct and do it on the fly. This study is the first to define the correctness of a response based on how the support team at AFAS makes decisions. It leverages literature on natural language generation and automated answer grading systems to automate the decision-making of the customer support team. We investigated questions requiring a binary response (e.g., Would it be possible to adjust tax rates manually?) or instructions (e.g., How would I adjust tax rate manually?) to test how close our automated approach reaches support rating. Our approach can identify wrong messages in 55\% of the cases. This work demonstrates the potential for automatically assessing when our chatbot may provide incorrect or misleading answers. Specifically, we contribute (1) a definition and metrics for assessing correctness, and (2) suggestions to improve correctness with respect to regional language and question type.
SEJan 25, 2022
The Unexplored Terrain of Compiler WarningsGunnar Kudrjavets, Aditya Kumar, Nachiappan Nagappan et al.
The authors' industry experiences suggest that compiler warnings, a lightweight version of program analysis, are valuable early bug detection tools. Significant costs are associated with patches and security bulletins for issues that could have been avoided if compiler warnings were addressed. Yet, the industry's attitude towards compiler warnings is mixed. Practices range from silencing all compiler warnings to having a zero-tolerance policy as to any warnings. Current published data indicates that addressing compiler warnings early is beneficial. However, support for this value theory stems from grey literature or is anecdotal. Additional focused research is needed to truly assess the cost-benefit of addressing warnings.
SEAug 23, 2021
Pull Request Latency Explained: An Empirical OverviewXunhui Zhang, Yue Yu, Tao Wang et al.
Pull request latency evaluation is an essential application of effort evaluation in the pull-based development scenario. It can help the reviewers sort the pull request queue, remind developers about the review processing time, speed up the review process and accelerate software development. There is a lack of work that systematically organizes the factors that affect pull request latency. Also, there is no related work discussing the differences and variations in characteristics in different scenarios and contexts. In this paper, we collected relevant factors through a literature review approach. Then we assessed their relative importance in five scenarios and six different contexts using the mixed-effects linear regression model. We find that the relative importance of factors differs in different scenarios, e.g., the first response time of the reviewer is most important when there exist comments. Meanwhile, the number of commits in a pull request has a more significant impact on pull request latency when closing than submitting due to changes in contributions brought about by the review process.
SEJul 13, 2021
Promises and Perils of Inferring Personality on GitHubFrenk van Mil, Ayushi Rastogi, Andy Zaidman
Personality plays a pivotal role in our understanding of human actions and behavior. Today, the applications of personality are widespread, built on the solutions from psychology to infer personality. In software engineering, for instance, one widely used solution to infer personality uses textual communication data. As studies on personality in software engineering continue to grow, it is imperative to understand the performance of these solutions. This paper compares the inferential ability of three widely studied text-based personality tests against each other and the ground truth on GitHub. We explore the challenges and potential solutions to improve the inferential ability of personality tests. Our study shows that solutions for inferring personality are far from being perfect. Software engineering communications data can infer individual developer personality with an average error rate of 41%. In the best case, the error rate can be reduced up to 36% by following our recommendations.
SEJun 3, 2021
How does Software Change?Ayushi Rastogi, Georgios Gousios
Software evolves with changes to its codebase over time. Internally, software changes in response to decisions to include some code change into the codebase and discard others. Explaining the mechanism of software evolution, this paper presents a theory of software change. Our theory is grounded in multiple evidence sources (e.g., GitHub documentation and relevant scientific literature) relating to the pull-based development model in GitHub. The resulting theory explains the influence of project-related core concepts (e.g., people and governance) as well as its ecosystem on the decision of software change.
SEOct 7, 2020
Questions for Data Scientists in Software Engineering: A ReplicationHennie Huijgens, Ayushi Rastogi, Ernst Mulders et al.
In 2014, a Microsoft study investigated the sort of questions that data science applied to software engineering should answer. This resulted in 145 questions that developers considered relevant for data scientists to answer, thus providing a research agenda to the community. Fast forward to five years, no further studies investigated whether the questions from the software engineers at Microsoft hold for other software companies, including software-intensive companies with different primary focus (to which we refer as software-defined enterprises). Furthermore, it is not evident that the problems identified five years ago are still applicable, given the technological advances in software engineering.