SE Mar 23, 2023 Code
GiveMeLabeledIssues: An Open Source Issue Recommendation System
Joseph Vargovich, Fabio Santos, Jacob Penney et al.
Developers often struggle to navigate an Open Source Software (OSS) project's issue-tracking system and find a suitable task. Proper issue labeling can aid task selection, but current tools are limited to classifying the issues according to their type (e.g., bug, question, good first issue, feature, etc.). In contrast, this paper presents a tool (GiveMeLabeledIssues) that mines project repositories and labels issues based on the skills required to solve them. We leverage the domain of the APIs involved in the solution (e.g., User Interface (UI), Test, Databases (DB), etc.) as a proxy for the required skills. GiveMeLabeledIssues facilitates matching developers' skills to tasks, reducing the burden on project maintainers. The tool obtained a precision of 83.9% when predicting the API domains involved in the issues. The replication package contains instructions on executing the tool and including new projects. A demo video is available at https://www.youtube.com/watch?v=ic2quUue7i8
SE Apr 6, 2023 Code
Tag that issue: Applying API-domain labels in issue tracking systems
Fabio Santos, Joseph Vargovich, Bianca Trinkenreich et al.
Labeling issues with the skills required to complete them can help contributors to choose tasks in Open Source Software projects. However, manually labeling issues is time-consuming and error-prone, and current automated approaches are mostly limited to classifying issues as bugs/non-bugs. We investigate the feasibility and relevance of automatically labeling issues with what we call "API-domains," which are high-level categories of APIs. Therefore, we posit that the APIs used in the source code affected by an issue can be a proxy for the type of skills (e.g., DB, security, UI) needed to work on the issue. We ran a user study (n=74) to assess API-domain labels' relevancy to potential contributors, leveraged the issues' descriptions and the project history to build prediction models, and validated the predictions with contributors (n=20) of the projects. Our results show that (i) newcomers to the project consider API-domain labels useful in choosing tasks, (ii) labels can be predicted with a precision of 84% and a recall of 78.6% on average, (iii) the results of the predictions reached up to 71.3% in precision and 52.5% in recall when training with a project and testing in another (transfer learning), and (iv) project contributors consider most of the predictions helpful in identifying needed skills. These findings suggest our approach can be applied in practice to automatically label issues, assisting developers in finding tasks that better match their skills.
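For readers unfamiliar with how precision and recall apply when each issue can carry several API-domain labels, the following is a minimal sketch. It assumes micro-averaging over all issue-label pairs; the abstract only says "on average" and does not state which averaging scheme the authors used. The function name and the sample labels are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Set


def micro_precision_recall(
    predicted: List[Set[str]], actual: List[Set[str]]
) -> Dict[str, float]:
    """Micro-averaged precision and recall for multi-label predictions.

    Assumption: micro-averaging; the paper may average differently.
    """
    # A true positive is a label that was both predicted and actually present.
    true_pos = sum(len(p & a) for p, a in zip(predicted, actual))
    n_predicted = sum(len(p) for p in predicted)
    n_actual = sum(len(a) for a in actual)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_actual if n_actual else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical API-domain labels for three issues (illustrative data only).
predicted = [{"UI", "DB"}, {"Test"}, {"UI", "Security"}]
actual = [{"UI"}, {"Test", "DB"}, {"UI", "Security"}]
scores = micro_precision_recall(predicted, actual)
print(scores)
```

In this toy data, four of the five predicted labels are correct and four of the five true labels are recovered, so both metrics come out at 0.8.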
HC Nov 12, 2023
Anticipating User Needs: Insights from Design Fiction on Conversational Agents for Computational Thinking
Jacob Penney, João Felipe Pimentel, Igor Steinmacher et al.
Computational thinking, and by extension, computer programming, is notoriously challenging to learn. Conversational agents and generative artificial intelligence (genAI) have the potential to facilitate this learning process by offering personalized guidance, interactive learning experiences, and code generation. However, current genAI-based chatbots focus on professional developers and may not adequately consider educational needs. Involving educators in conceiving educational tools is critical for ensuring usefulness and usability. We enlisted nine instructors to engage in design fiction sessions in which we elicited the abilities that a conversational agent supported by genAI should display. Participants envisioned a conversational agent that guides students stepwise through exercises, tuning its method of guidance with an awareness of the educational background, skills and deficits, and learning preferences. The insights obtained in this paper can guide future implementations of tutoring conversational agents oriented toward teaching computational thinking and computer programming.
82.6 SE May 18 Code
Restructure This: Using AI to Restructure Onboarding Documents to Reduce Cognitive Overload
Zixuan Feng, Prashant Tandan, Igor Steinmacher et al.
Onboarding documentation is critical for attracting and retaining newcomers in open source software (OSS). However, it is often presented in dense, inconsistently structured, and fragmented forms that are difficult to understand, which creates cognitive overload leading to frustration, errors, and abandonment. Here, we investigate how Cognitive Theory of Multimedia Learning (CTML) strategies can be used to restructure OSS documentation. We use a GenAI-based pipeline to operationalize these strategies to restructure OSS documentation through our prototype VisDoc. VisDoc segments documentation into task-based units, infers workflows, removes redundancy, and generates multimodal explanations. An expert evaluation (N=4) affirmed VisDoc's completeness, accuracy, and adoptability; a between-subjects evaluation (N=14) with newcomers found that VisDoc participants achieved higher task success, had significantly lower cognitive load, and perceived higher usability. The contributions of this work include a CTML-grounded analysis of onboarding challenges, a GenAI-based documentation restructuring pipeline, and empirical evidence that cognitively informed documentation restructuring reduces cognitive load and improves usability and task performance in OSS.
28.5 SE Mar 25 Code
Governance in Practice: How Open Source Projects Define and Document Roles
Pedro Oliveira, Tayana Conte, Marco Gerosa et al.
Open source software (OSS) sustainability depends not only on code contributions but also on governance structures that define who decides, who acts, and how responsibility is distributed. We lack systematic empirical evidence of how projects formally codify roles and authority in written artifacts. This paper investigates how OSS projects define and structure governance through their GOVERNANCE.md files and related documents. We analyze governance as an institutional infrastructure, a set of explicit rules that shape participation, decision rights, and community memory. We used Institutional Grammar to extract and formalize role definitions from repositories hosted on GitHub. We decompose each role into scope, privileges, obligations, and life-cycle rules to compare role structures across communities. Our results show that although OSS projects use a stable set of titles, identical titles carry different responsibilities, and different labels describe similar functions, which we call role drift. Still, we observed that a few actors sometimes accumulate technical, managerial, and community duties. By understanding authority and responsibilities in OSS, our findings inform researchers and practitioners on the importance of designing clearer roles, distributing work, and reducing leadership overload to support healthier and more sustainable communities.
19.2 SE May 7 Code
Guidelines for Cultivating a Sense of Belonging to Reduce Developer Burnout
Bianca Trinkenreich, Marco Aurelio Gerosa, Anita Sarma et al.
Burnout affects software developers' mental and physical well-being and contributes to turnover, generating strong concerns in the software industry. Prior research has shown that lack of belonging is associated with higher levels of burnout among software developers, while a sense of belonging is linked to resilience, job satisfaction, engagement, and well-being. In this paper, we revisit recent studies on belongingness in software development teams, including proprietary software organizations and open-source software communities, to offer evidence-based guidelines for cultivating belongingness and reducing developer burnout. We summarize characteristics of belongingness, such as trust, acceptance, value recognition, friendship, membership, mutual support, and being known by others, as well as factors associated with belongingness, including recognition, psychological safety, intrinsic motivation, English confidence, tenure, gender, and cultural power distance. Based on these findings, we propose practical guidelines for leaders and communities, including timely and consistent recognition, transparent promotion rules, inclusive benefits and initiatives, intentional connections through collaborative tools, blameless postmortems, optional in-person opportunities, informal newcomer gatherings, and continuous monitoring of belongingness and burnout. These guidelines can help software organizations and open-source communities foster healthier, more inclusive environments that support developer well-being.
5.3 SE May 7 Code
Analyzing the Adoption of Database Management Systems Throughout the History of Open Source Projects
Camila A. Paiva, Raquel Maximino, Frederico Paiva et al.
Database Management Systems (DBMSs) are widely used to store, retrieve, and manage the data handled by modern applications. Although prior work has studied the co-evolution of DBMSs and application source code, less is known about DBMS adoption, co-use, and replacement in real systems. This paper presents a historical study of DBMS usage in 362 popular open-source Java projects hosted on GitHub. We investigated the adoption of the top DBMSs ranked by DB-Engines, covering relational and non-relational systems. Using source-code heuristics, we analyzed DBMS popularity, stability, migration patterns, co-occurrence, and the role of Object-Relational Mappers (ORMs). Our findings show that MySQL and PostgreSQL are the most popular DBMSs in our corpus. Among non-relational DBMSs, Redis and MongoDB are the most frequently used and tend to remain stable after adoption. In contrast, systems such as HyperSQL are more often replaced as projects evolve. We also observed frequent co-use of multiple DBMSs, suggesting patterns of polyglot persistence in which projects combine systems to handle different data needs. Finally, we found that ORM frameworks are commonly used to mediate interactions between applications and DBMSs. Overall, our study provides empirical evidence on how DBMSs are adopted, combined, and replaced over time, offering guidance for developers, architects, educators, and DBMS vendors.
48.3 SE Apr 5
The Fast and Spurious: Developer Productivity with GenAI
Sadia Afroz, Zixuan Feng, Tyler Menezes et al.
Generative AI (GenAI) tools are increasingly being adopted in software development as productivity aids, since there is evidence that GenAI tools can improve individual aspects of productivity. However, productivity is multidimensional; accelerating one aspect of work may simply shift effort to another. In this paper, we investigate how GenAI adoption affects different dimensions of developer productivity. We surveyed 415 software practitioners to understand how they perceive productivity changes associated with AI adoption, using the SPACE framework (Satisfaction and well-being, Performance, Activity, Communication and collaboration, and Efficiency and flow). Our results reveal systematic redistribution of effort across SPACE dimensions. While frequent GenAI users reported faster task completion and higher output volume, these gains were offset by increased code review burden, persistent cognitive load from output verification, and unchanged collaboration patterns. We further provide an empirical mapping between the challenges perceived by developers and potential strategies to mitigate them. Overall, our findings suggest that, at the current stage of GenAI adoption, perceived productivity gains may be spurious -- surface-level acceleration, often accompanied by redistributed effort and hidden costs.
SE Jan 9, 2024 Code
Applying Large Language Models API to Issue Classification Problem
Gabriel Aracena, Kyle Luster, Fabio Santos et al.
Effective prioritization of issue reports is crucial in software engineering to optimize resource allocation and address critical problems promptly. However, the manual classification of issue reports for prioritization is laborious and lacks scalability. Alternatively, many open source software (OSS) projects employ automated processes for this task, albeit relying on substantial datasets for adequate training. This research seeks to devise an automated approach that ensures reliability in issue prioritization, even when trained on smaller datasets. Our proposed methodology harnesses the power of Generative Pre-trained Transformers (GPT), recognizing their potential to efficiently handle this task. By leveraging the capabilities of such models, we aim to develop a robust system for prioritizing issue reports accurately, mitigating the necessity for extensive training data while maintaining reliability. In our research, we have developed a reliable GPT-based approach to accurately label and prioritize issue reports with a reduced training dataset. By reducing reliance on massive data requirements and focusing on few-shot fine-tuning, our methodology offers a more accessible and efficient solution for issue prioritization in software engineering. Our model predicted issue types in individual projects up to 93.2% in precision, 95% in recall, and 89.3% in F1-score.
SE Jan 27, 2025 Code
SkillScope: A Tool to Predict Fine-Grained Skills Needed to Solve Issues on GitHub
Benjamin C. Carter, Jonathan Rivas Contreras, Carlos A. Llanes Villegas et al.
New contributors often struggle to find tasks that they can tackle when onboarding onto a new Open Source Software (OSS) project. One reason for this difficulty is that issue trackers lack explanations about the knowledge or skills needed to complete a given task successfully. These explanations can be complex and time-consuming to produce. Past research has partially addressed this problem by labeling issues with issue types, issue difficulty level, and issue skills. However, current approaches are limited to a small set of labels and lack in-depth details about their semantics, which may not sufficiently help contributors identify suitable issues. To surmount this limitation, this paper explores large language models (LLMs) and Random Forest (RF) to predict the multilevel skills required to solve the open issues. We introduce a novel tool, SkillScope, which retrieves current issues from Java projects hosted on GitHub and predicts the multilevel programming skills required to resolve these issues. In a case study, we demonstrate that SkillScope could predict 217 multilevel skills for tasks with 91% precision, 88% recall, and 89% F-measure on average. Practitioners can use this tool to better delegate or choose tasks to solve in OSS projects.
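As a quick reading aid for the figures above, the F-measure is conventionally the harmonic mean of precision and recall. The sketch below applies that standard formula to the reported averages. It is a sanity check under an assumption, not the authors' evaluation code: the paper averages per-skill scores, so its reported F-measure need not equal the harmonic mean of the averaged precision and recall exactly. The function name is hypothetical.

```python
def f_measure(precision: float, recall: float) -> float:
    """Standard F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averages reported in the abstract: 91% precision, 88% recall.
f1 = f_measure(0.91, 0.88)
print(round(f1, 3))
```

The result is roughly 0.89, in line with the 89% F-measure stated in the abstract.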
SE Oct 24, 2025 Code
A Comparison of Conversational Models and Humans in Answering Technical Questions: the Firefox Case
Joao Correia, Daniel Coutinho, Marco Castelluccio et al.
The use of Large Language Models (LLMs) to support tasks in software development has steadily increased over recent years, from assisting developers in coding activities to providing conversational agents that answer newcomers' questions. In collaboration with the Mozilla Foundation, this study evaluates the effectiveness of Retrieval-Augmented Generation (RAG) in assisting developers within the Mozilla Firefox project. We conducted an empirical analysis comparing responses from human developers, a standard GPT model, and a GPT model enhanced with RAG, using real queries from Mozilla's developer chat rooms. To ensure a rigorous evaluation, Mozilla experts assessed the responses based on helpfulness, comprehensiveness, and conciseness. The results show that RAG-assisted responses were more comprehensive than those of human developers (62.50% to 54.17%) and almost as helpful (75.00% to 79.17%), suggesting RAG's potential to enhance developer assistance. However, the RAG responses were not as concise and often verbose. The results show the potential to apply RAG-based tools to Open Source Software (OSS) to minimize the load on core maintainers without losing answer quality. Toning down retrieval mechanisms and making responses even shorter in the future would enhance developer assistance in massive projects like Mozilla Firefox.
SE May 30, 2025 Code
Applying Large Language Models to Issue Classification: Revisiting with Extended Data and New Models
Gabriel Aracena, Kyle Luster, Fabio Santos et al.
Effective prioritization of issue reports in software engineering helps to optimize resource allocation and information recovery. However, manual issue classification is laborious and lacks scalability. As an alternative, many open source software (OSS) projects employ automated processes for this task, yet this method often relies on large datasets for adequate training. Traditionally, machine learning techniques have been used for issue classification. More recently, large language models (LLMs) have emerged as powerful tools for addressing a range of software engineering challenges, including code and test generation, mapping new requirements to legacy software endpoints, and conducting code reviews. The following research investigates an automated approach to issue classification based on LLMs. By leveraging the capabilities of such models, we aim to develop a robust system for prioritizing issue reports, mitigating the necessity for extensive training data while also maintaining reliability in classification. In our research, we developed an LLM-based approach for accurately labeling issues by selecting two of the most prominent large language models. We then compared their performance across multiple datasets. Our findings show that GPT-4o achieved the best results in classifying issues from the NLBSE 2024 competition. Moreover, GPT-4o outperformed DeepSeek R1, achieving an F1 score 20% higher when both models were trained on the same dataset from the NLBSE 2023 competition, which was ten times larger than the NLBSE 2024 dataset. The fine-tuned GPT-4o model attained an average F1 score of 80.7%, while the fine-tuned DeepSeek R1 model achieved 59.33%. Increasing the dataset size did not improve the F1 score, reducing the dependence on massive datasets for building an efficient solution to issue classification.
SE Feb 27, 2022 Code
How to Debug Inclusivity Bugs? A Debugging Process with Information Architecture
Mariam Guizani, Igor Steinmacher, Jillian Emard et al.
Although some previous research has found ways to find inclusivity bugs (biases in software that introduce inequities), little attention has been paid to how to go about fixing such bugs. Without a process to move from finding to fixing, acting upon such findings is an ad-hoc activity, at the mercy of the skills of each individual developer. To address this gap, we created Why/Where/Fix, a systematic inclusivity debugging process whose inclusivity fault localization harnesses Information Architecture (IA) -- the way user-facing information is organized, structured and labeled. We then conducted a multi-stage qualitative empirical evaluation of the effectiveness of Why/Where/Fix, using an Open Source Software (OSS) project's infrastructure as our setting. In our study, the OSS project team used the Why/Where/Fix process to find inclusivity bugs, localize the IA faults behind them, and then fix the IA to remove the inclusivity bugs they had found. Our results showed that using Why/Where/Fix reduced the number of inclusivity bugs that OSS newcomer participants experienced by 90%.
SE Feb 27, 2022 Code
Perceptions of the State of D&I and D&I Initiative in the ASF
Mariam Guizani, Bianca Trinkenreich, Aileen Abril Castro-Guzman et al.
Open Source Software (OSS) Foundations and projects are investing in creating Diversity and Inclusion (D&I) initiatives. However, little is known about contributors' perceptions about the usefulness and success of such initiatives. We aim to close this gap by investigating how contributors perceive the state of D&I in their community. In collaboration with the Apache Software Foundation (ASF), we surveyed 600+ OSS contributors and conducted 11 follow-up interviews. We used mixed methods to analyze our data: quantitative analysis of Likert-scale questions and qualitative analysis of open-ended survey questions and the interviews to understand contributors' perceptions and critiques of the D&I initiative and how to improve it. Our results indicate that the ASF contributors felt that the state of D&I was still lacking, especially regarding gender, seniority, and English proficiency. Regarding the D&I initiative, some participants felt that the effort was unnecessary, while others agreed with the effort but critiqued its implementation. These findings show that D&I initiatives in OSS communities are a good start, but there is room for improvement. Our results can inspire the creation of new and the refinement of current initiatives.
SE May 18, 2021 Code
Pots of Gold at the End of the Rainbow: What is Success for Open Source Contributors?
Bianca Trinkenreich, Mariam Guizani, Igor Wiese et al.
Success in Open Source Software (OSS) is often perceived as an exclusively code-centric endeavor. This perception can exclude a variety of individuals with a diverse set of skills and backgrounds, in turn helping create the current diversity & inclusion imbalance in OSS. Because people's perspectives of success affect their personal, professional, and life choices, to be able to support a diverse class of individuals, we must first understand what OSS contributors consider successful. Thus far, research has used a uni-dimensional, code-centric lens to define success. In this paper, we challenge this status-quo and reveal the multi-faceted definition of success among OSS contributors. We do so through interviews with 27 OSS contributors who are recognized as successful in their communities, and a follow-up open survey with 193 OSS contributors. Our study provides nuanced definitions of success perceptions in OSS, which might help devise strategies to attract and retain a diverse set of contributors, helping them attain their "pots of gold at the end of the rainbow".
SE May 18, 2021 Code
Women's Participation in Open Source Software: A Survey of the Literature
Bianca Trinkenreich, Igor Wiese, Anita Sarma et al.
Participation of women in Open Source Software (OSS) is very unbalanced, despite various efforts to improve diversity. This is concerning not only because women do not get the chance of career and skill developments afforded by OSS, but also because OSS projects suffer from a lack of diversity of thought because of a lack of diversity in their projects. Studies that characterize women's participation and investigate how to attract and retain women are spread across multiple fields, including information systems, software engineering, and social science. This paper systematically maps, aggregates, and synthesizes the state-of-the-art on women's participation in Open Source Software. It focuses on women's representation and the demographics of women who contribute to OSS, how they contribute, the acceptance rates of their contributions, their motivations and challenges, and strategies employed by communities to attract and retain women. We identified 51 articles (published between 2005 and 2021) that investigate women's participation in OSS. According to the literature, women represent about 9.8% of OSS contributors; most of them are recent contributors, 20-37 years old, devote less than 5h/week to OSS, and make both non-code and code contributions. Only 5% of projects have women as core developers, and women author less than 5% of pull-requests but have similar or even higher rates of merge acceptance than men. Besides learning new skills and altruism, reciprocity and kinship are motivations especially relevant for women, but women can leave if they are not compensated for their contributions. Women's challenges are mainly social, including lack of peer parity and non-inclusive communication from a toxic culture. The literature reports ten strategies, which were mapped to six of the seven challenges. Based on these results, we provide guidelines for future research and practice.
SE Mar 25, 2021 Code
Don't Disturb Me: Challenges of Interacting with Software Bots on Open Source Software Projects
Mairieli Wessel, Igor Wiese, Igor Steinmacher et al.
Software bots are used to streamline tasks in Open Source Software (OSS) projects' pull requests, saving development cost, time, and effort. However, their presence can be disruptive to the community. We identified several challenges caused by bots in pull request interactions by interviewing 21 practitioners, including project maintainers, contributors, and bot developers. In particular, our findings indicate noise as a recurrent and central problem. Noise affects both human communication and development workflow by overwhelming and distracting developers. Our main contribution is a theory of how human developers perceive annoying bot behaviors as noise on social coding platforms. This contribution may help practitioners understand the effects of adopting a bot, and researchers and tool designers may leverage our results to better support human-bot interaction on social coding platforms.
SE Mar 25, 2021 Code
Quality Gatekeepers: Investigating the Effects of Code Review Bots on Pull Request Activities
Mairieli Wessel, Alexander Serebrenik, Igor Wiese et al.
Software bots have been facilitating several development activities in Open Source Software (OSS) projects, including code review. However, these bots may bring unexpected impacts to group dynamics, as frequently occurs with new technology adoption. Understanding and anticipating such effects is important for planning and management. To analyze these effects, we investigate how several activity indicators change after the adoption of a code review bot. We employed a regression discontinuity design on 1,194 software projects from GitHub. We also interviewed 12 practitioners, including open-source maintainers and contributors. Our results indicate that the adoption of code review bots increases the number of monthly merged pull requests, decreases monthly non-merged pull requests, and decreases communication among developers. From the developers' perspective, these effects are explained by the transparency and confidence the bot comments introduce, in addition to the changes in the discussion focused on pull requests. Practitioners and maintainers may leverage our results to understand, or even predict, bot effects on their projects.
SE Mar 23, 2021 Code
Can I Solve It? Identifying APIs Required to Complete OSS Task
Fabio Santos, Igor Wiese, Bianca Trinkenreich et al.
Open Source Software projects add labels to open issues to help contributors choose tasks. However, manually labeling issues is time-consuming and error-prone. Current automatic approaches for creating labels are mostly limited to classifying issues as a bug/non-bug. In this paper, we investigate the feasibility and relevance of labeling issues with the domain of the APIs required to complete the tasks. We leverage the issues' description and the project history to build prediction models, which resulted in precision up to 82% and recall up to 97.8%. We also ran a user study (n=74) to assess these labels' relevancy to potential contributors. The results show that the labels were useful to participants in choosing tasks, and the API-domain labels were selected more often than the existing architecture-based labels. Our results can inspire the creation of tools to automatically label issues, helping developers to find tasks that better match their skills.
SE Mar 8, 2021 Code
Will You Come Back to Contribute? Investigating the Inactivity of OSS Core Developers in GitHub
Fabio Calefato, Marco Aurelio Gerosa, Giuseppe Iaffaldano et al.
Several Open Source Software (OSS) projects depend on the continuity of their development communities to remain sustainable. Understanding how developers become inactive or why they take breaks can help communities prevent abandonment and incentivize developers to come back. In this paper, we propose a novel method to identify developers' inactive periods by analyzing the individual rhythm of contributions to the projects. Using this method, we quantitatively analyze the inactivity of core developers in 18 OSS organizations hosted on GitHub. We also survey core developers to receive their feedback about the identified breaks and transitions. Our results show that our method was effective for identifying developers' breaks. About 94% of the surveyed core developers agreed with our state model of inactivity; 71% and 79% of them acknowledged their breaks and state transition, respectively. We also show that all core developers take breaks (at least once) and about half of them (~45%) have completely disengaged from a project for at least one year. We also analyzed the probability of transitions to/from inactivity and found that developers who pause their activity have a ~35-55% chance to return to an active state; yet, if the break lasts for a year or longer, then the probability of resuming activities drops to ~21-26%, with a ~54% chance of complete disengagement. These results may support the creation of policies and mechanisms to make OSS community managers aware of breaks and potential project abandonment.
SE Jan 25, 2021 Code
The Shifting Sands of Motivation: Revisiting What Drives Contributors in Open Source
Marco Gerosa, Igor Wiese, Bianca Trinkenreich et al.
Open Source Software (OSS) has changed drastically over the last decade, with OSS projects now producing a large ecosystem of popular products, involving industry participation, and providing professional career opportunities. But our field's understanding of what motivates people to contribute to OSS is still fundamentally grounded in studies from the early 2000s. With the changed landscape of OSS, it is very likely that motivations to join OSS have also evolved. Through a survey of 242 OSS contributors, we investigate shifts in motivation from three perspectives: (1) the impact of the new OSS landscape, (2) the impact of individuals' personal growth as they become part of OSS communities, and (3) the impact of differences in individuals' demographics. Our results show that some motivations related to social aspects and reputation increased in frequency and that some intrinsic and internalized motivations, such as learning and intellectual stimulation, are still highly relevant. We also found that contributing to OSS often transforms extrinsic motivations to intrinsic, and that while experienced contributors often shift toward altruism, novices often shift toward career, fun, kinship, and learning. OSS projects can leverage our results to revisit current strategies to attract and retain contributors, and researchers and tool builders can better support the design of new studies and tools to engage and support OSS development.
SE Oct 13, 2019 Code
Google Summer of Code: Student Motivations and Contributions
Jefferson O. Silva, Igor Wiese, Daniel M. German et al.
Several open source software (OSS) projects expect to foster newcomers' onboarding and to receive contributions by participating in engagement programs, like Summers of Code. However, there is little empirical evidence showing why students join such programs. In this paper, we study the well-established Google Summer of Code (GSoC), which is a 3-month OSS engagement program that offers stipends and mentors to students willing to contribute to OSS projects. We combined a survey (students and mentors) and interviews (students) to understand what motivates students to enter GSoC. Our results show that students enter GSoC for an enriching experience, not necessarily to become frequent contributors. Our data suggest that, while the stipends are an important motivator, the students participate for work experience and the ability to attach the name of the supporting organization to their resumés. We also discuss practical implications for students, mentors, OSS projects, and Summer of Code programs.
SEMar 22, 2019Code
Why do developers take breaks from contributing to OSS projects? A preliminary analysisGiuseppe Iaffaldano, Igor Steinmacher, Fabio Calefato et al.
Creating a successful and sustainable Open Source Software (OSS) project often depends on the strength and the health of the community behind it. Current literature explains the contributors' lifecycle, starting with the motivations that drive people to contribute and barriers to joining OSS projects, covering developers' evolution until they become core members. However, the stages when developers leave the projects are still weakly explored and are not well-defined in existing developers' lifecycle models. In this position paper, we enrich the knowledge about the leaving stage by identifying sleeping and dead states, representing temporary and permanent breaks that developers take from contributing. We conducted a preliminary set of semi-structured interviews with active developers. We analyzed the answers by focusing on defining and understanding the reasons for the transitions to/from sleeping and dead states. This paper raises new questions that may guide further discussions and research, which may ultimately benefit OSS communities.
SEMay 17, 2021
Buying time in software development: how estimates become commitments?Patricia Matsubara, Igor Steinmacher, Bruno Gadelha et al.
Despite years of research for improving accuracy, software practitioners still face software estimation difficulties. Expert judgment has been the prevalent method used in industry, and researchers' focus on raising realism in estimates when using it seems not to be enough for the much-expected improvements. Instead of focusing on the estimation process's technicalities, we investigated the interaction of the establishment of commitments with customers and software estimation. By observing estimation sessions and interviewing software professionals from companies in varying contexts, we found that defensible estimates and padding of software estimates are crucial in converting estimates into commitments. Our findings show that software professionals use padding for three different reasons: contingency buffer, completing other tasks, or improving the overall quality of the product. The reasons to pad have a common theme: buying time to balance short- and long-term software development commitments, including the repayment of technical debt. Such a theme emerged from the human aspects of the interaction of estimation and the establishment of commitments: pressures and customers' conflicting short and long-term needs play silent and unrevealed roles in-between the technical activities. Therefore, our study contributes to untangling the underlying phenomena, showing how the practices used by software practitioners help to deal with the human and social context in which estimation is embedded.
SEFeb 3, 2020
Analyzing the evolution and diversity of SBES Program CommitteeFabio Pacheco, Igor Wiese, Bruno Cartaxo et al.
The Brazilian Symposium on Software Engineering (SBES) is one of the most important Latin American Software Engineering conferences. It was first held in 1987, and 2019 marks its 33rd edition. Over these years, many researchers have participated in SBES, attending the conference, submitting, and reviewing papers. The researchers who participate in the Program Committee (PC) and perform the reviewers' role are fundamentally important to SBES, since their evaluations (e.g., deciding whether a paper is accepted or not) have the potential to shape what SBES is today. Knowing that diversity is an important aspect of any group work, we wanted to understand diversity in the SBES PC community. We investigated a number of characteristics of SBES PC members, including their gender and geographic location. We also analyzed the turnover and renovation of the committee. Among the findings, we observed that although the number of participants in the SBES PC has increased over the years, most of them are men (~80%) and from the Southeast and Northeast of Brazil, with very few members from the North region. We also observed that there is a small turnover: during the 2010s, only 11% of new members were added to the PC. Finally, we investigated the participation of the PC members publishing papers at SBES. We observed that only 24% of the papers accepted to SBES were authored by members who were not committee members of the respective year. Moreover, committee members usually do not collaborate among themselves: a significant number of the papers are authored by the PC members and students. This paper may contribute to the SBES community, in particular, its special interest group, in understanding the needs and challenges of the PC's participants.
SEOct 31, 2019
Challenges for Inclusion in Software Engineering: The Case of the Emerging Papua New Guinean SocietyRaula Gaikovina Kula, Christoph Treude, Hideaki Hata et al.
Software plays a central role in modern societies, with its high economic value and potential for advancing societal change. In this paper, we characterise challenges and opportunities for a country progressing towards entering the global software industry, focusing on Papua New Guinea (PNG). By hosting a Software Engineering workshop, we conducted a qualitative study by recording talks (n=3), employing a questionnaire (n=52), and administering an in-depth focus group session with local actors (n=5). Based on a thematic analysis, we identified challenges as barriers and opportunities for the PNG software engineering community. We also discuss the state of practices and how to make it inclusive for practitioners, researchers, and educators from both the local and global software engineering community.