SE Apr 24 Code
Do Good, Stay Longer? Temporal Patterns and Predictors of Newcomer-to-Core Transitions in Conventional OSS and OSS4SG
Mohamed Ouf, Amr Mohamed, Mariam Guizani
Open Source Software (OSS) sustainability relies on newcomers transitioning to core contributors, but this pipeline is broken, with most newcomers becoming inactive after initial contributions. Open Source Software for Social Good (OSS4SG) projects, which prioritize societal impact as their primary mission, may be associated with different newcomer-to-core transition outcomes than conventional OSS projects. We compared 375 projects (190 OSS4SG, 185 OSS), analyzing 92,721 contributors and 3.5 million commits. OSS4SG projects retain contributors at 2.2X higher rates, and their contributors have a 19.6% higher probability of achieving core status. Early broad project exploration predicts core achievement (22.2% importance); conventional OSS concentrates on one dominant pathway (61.62% of transitions) while OSS4SG provides multiple pathways. Contrary to intuition, contributors who invest time learning the project before intensifying their contributions (Late Spike pattern) achieve core status 2.4-2.9X faster (21 weeks) than those who contribute intensively from day one (Early Spike pattern, 51-60 weeks). OSS4SG supports two effective temporal patterns, while only Late Spike achieves the fastest time-to-core in conventional OSS. Our findings suggest that finding a project aligned with personal values and taking time to understand the codebase before making major contributions are key strategies for achieving core status. They also show that project mission is associated with measurably different environments for newcomer-to-core transitions and provide evidence-based guidance for newcomers and maintainers.
HC Apr 23 Code
Same Project, Different Start: How Contribution Events Shape Activity and Retention in Open Source
Mohamed Ouf, Mariam Guizani
Open source projects depend on newcomers who stay, yet most leave after a single contribution. Contribution events such as Google Summer of Code, LFX Mentorship, Hacktoberfest, and 24 Pull Requests attract thousands of newcomers each year, but whether they produce lasting contributors remains unclear. We conduct the first matched-cohort study comparing 2,001 event-based and 2,001 organic contributors across 330 projects. Our results reveal three key findings. First, event contributors have significantly higher odds of becoming core contributors (12.1% vs. 9.6%, p < 0.001, OR = 1.31) and stay significantly longer (median 8.2 vs. 4.8 months). Second, each entry mechanism is associated with a fundamentally different engagement rhythm: 68.9% of mentorship contributors sustain Steady weekly activity across their first 12 weeks, whereas 61.0% of non-mentorship contributors exhibit Front-Loading and 57.0% of organic contributors exhibit Intermittent engagement (p < 0.001). Third, Steady engagement is associated with significantly longer retention regardless of group (median 13 vs. 8 months for Front-Loading), yet mentorship contributors who lose their program scaffolding show shorter retention than self-sustained non-mentorship contributors, revealing a mentor-dependency effect. A newcomer's first 12 weeks are strongly indicative of their long-term trajectory.
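As a rough reading aid for the odds ratio quoted above, the sketch below recomputes it from the two rounded percentages in the abstract (12.1% and 9.6%). This is only the textbook odds-ratio formula applied to rounded rates; how the authors obtained OR = 1.31 is not stated here, and the small gap is assumed to come from rounding or from estimation on the matched raw counts. The function names are illustrative.

```python
def odds(p: float) -> float:
    """Convert a probability into odds: p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """Odds ratio of group A relative to group B."""
    return odds(p_a) / odds(p_b)


# Rounded core-contributor rates taken from the abstract.
event_rate = 0.121    # event-based contributors
organic_rate = 0.096  # organic contributors

event_or = odds_ratio(event_rate, organic_rate)
print(f"odds ratio from rounded rates: {event_or:.2f}")
```

The result lands close to the reported value, which is what one would expect if the published figure was computed from unrounded counts.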
SE May 21
At What Cost? Software Developers' Well-Being in the Age of GenAI
Mariam Guizani, Maduka Subasinghage, Sherlock A. Licorish et al.
Generative Artificial Intelligence (GenAI) is rapidly reshaping software development, with growing emphasis on accelerating productivity and optimizing performance. However, excessive focus on such dimensions risks overlooking the critical implications for developer well-being. GenAI tools can amplify cognitive load, introduce new forms of oversight labor, and escalate expectations around output and pace, contributing to stress, burnout, and diminished work-life balance. The GenAI movement is also transforming professional norms, altering career entry points, demanding continuous adaptation, and deepening inequalities in access and support. This position paper calls for a reorientation of the GenAI research agenda in software development and proposes a theoretical framework to move beyond narrow performance metrics toward investigations that also center on human experience, social context, and sustainable productivity.
SE Feb 27, 2022 Code
How to Debug Inclusivity Bugs? A Debugging Process with Information Architecture
Mariam Guizani, Igor Steinmacher, Jillian Emard et al.
Although some previous research has found ways to find inclusivity bugs (biases in software that introduce inequities), little attention has been paid to how to go about fixing such bugs. Without a process to move from finding to fixing, acting upon such findings is an ad hoc activity, at the mercy of the skills of each individual developer. To address this gap, we created Why/Where/Fix, a systematic inclusivity debugging process whose inclusivity fault localization harnesses Information Architecture (IA), the way user-facing information is organized, structured, and labeled. We then conducted a multi-stage qualitative empirical evaluation of the effectiveness of Why/Where/Fix, using an Open Source Software (OSS) project's infrastructure as our setting. In our study, the OSS project team used the Why/Where/Fix process to find inclusivity bugs, localize the IA faults behind them, and then fix the IA to remove the inclusivity bugs they had found. Our results showed that using Why/Where/Fix reduced the number of inclusivity bugs that OSS newcomer participants experienced by 90%.
SE Feb 27, 2022 Code
Perceptions of the State of D&I and D&I Initiative in the ASF
Mariam Guizani, Bianca Trinkenreich, Aileen Abril Castro-Guzman et al.
Open Source Software (OSS) Foundations and projects are investing in creating Diversity and Inclusion (D&I) initiatives. However, little is known about contributors' perceptions of the usefulness and success of such initiatives. We aim to close this gap by investigating how contributors perceive the state of D&I in their community. In collaboration with the Apache Software Foundation (ASF), we surveyed 600+ OSS contributors and conducted 11 follow-up interviews. We used mixed methods to analyze our data: quantitative analysis of Likert-scale questions, and qualitative analysis of open-ended survey questions and the interviews, to understand contributors' perceptions and critiques of the D&I initiative and how to improve it. Our results indicate that the ASF contributors felt that the state of D&I was still lacking, especially regarding gender, seniority, and English proficiency. Regarding the D&I initiative, some participants felt that the effort was unnecessary, while others agreed with the effort but critiqued its implementation. These findings show that D&I initiatives in OSS communities are a good start, but there is room for improvement. Our results can inspire the creation of new initiatives and the refinement of current ones.
SE Feb 15, 2022 Code
Attracting and Retaining OSS Contributors with a Maintainer Dashboard
Mariam Guizani, Thomas Zimmermann, Anita Sarma et al.
Tools and artifacts produced by open source software (OSS) have been woven into the foundation of the technology industry. To keep this foundation intact, the open source community needs to actively invest in sustainable approaches to bring in new contributors and nurture existing ones. We take a first step toward this by collaboratively designing a maintainer dashboard that provides recommendations on how to attract and retain open source contributors. For example, it recommends highlighting project goals (e.g., a social good cause) to attract diverse contributors and using mechanisms (e.g., a "rising contributor" badge) to acknowledge existing contributors. Next, we conduct a project-specific evaluation with maintainers to better understand the use cases in which this tool will be most helpful in supporting their plans for growth. From analyzing feedback, we find the recommendations to be useful for signaling projects as welcoming and providing gentle nudges for maintainers to proactively recognize emerging contributors. However, there are complexities to consider when designing recommendations, such as the project's current development state (e.g., deadlines, milestones, refactoring) and governance model. Finally, we distill our findings to share what the future of recommendations in open source looks like and how to make these recommendations most meaningful over time.
SE May 18, 2021 Code
Pots of Gold at the End of the Rainbow: What is Success for Open Source Contributors?
Bianca Trinkenreich, Mariam Guizani, Igor Wiese et al.
Success in Open Source Software (OSS) is often perceived as an exclusively code-centric endeavor. This perception can exclude a variety of individuals with a diverse set of skills and backgrounds, in turn helping create the current diversity & inclusion imbalance in OSS. Because people's perspectives of success affect their personal, professional, and life choices, to be able to support a diverse class of individuals, we must first understand what OSS contributors consider successful. Thus far, research has used a uni-dimensional, code-centric lens to define success. In this paper, we challenge this status-quo and reveal the multi-faceted definition of success among OSS contributors. We do so through interviews with 27 OSS contributors who are recognized as successful in their communities, and a follow-up open survey with 193 OSS contributors. Our study provides nuanced definitions of success perceptions in OSS, which might help devise strategies to attract and retain a diverse set of contributors, helping them attain their "pots of gold at the end of the rainbow".
SE Jul 3, 2025
The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Literature Review
Amr Mohamed, Maram Assi, Mariam Guizani
Large language model assistants (LLM-assistants) present new opportunities to transform software development. Developers are increasingly adopting these tools across tasks, including coding, testing, debugging, documentation, and design. Yet, despite growing interest, there is no synthesis of how LLM-assistants affect software developer productivity. In this paper, we present a systematic literature review of 37 peer-reviewed studies published between January 2014 and December 2024 that examine this impact. Our analysis reveals that LLM-assistants offer both considerable benefits and critical risks. Commonly reported gains include minimized code search, accelerated development, and the automation of trivial and repetitive tasks. However, studies also highlight concerns around cognitive offloading, reduced team collaboration, and inconsistent effects on code quality. While the majority of studies (92%) adopt a multi-dimensional perspective by examining at least two SPACE dimensions, reflecting increased awareness of the complexity of developer productivity, only 14% extend beyond three dimensions, indicating substantial room for more integrated evaluations. Satisfaction, Performance, and Efficiency are the most frequently investigated dimensions, whereas Communication and Activity remain underexplored. Most studies are exploratory (64%) and methodologically diverse, but lack longitudinal and team-based evaluations. This review surfaces key research gaps and provides recommendations for future research and practice. All artifacts associated with this study are publicly available at https://zenodo.org/records/15788502.
SE Sep 25, 2025
Design, Implementation and Evaluation of a Novel Programming Language Topic Classification Workflow
Michael Zhang, Yuan Tian, Mariam Guizani
As software systems grow in scale and complexity, understanding the distribution of programming language topics within source code becomes increasingly important for guiding technical decisions, improving onboarding, and informing tooling and education. This paper presents the design, implementation, and evaluation of a novel programming language topic classification workflow. Our approach combines a multi-label Support Vector Machine (SVM) with a sliding window and voting strategy to enable fine-grained localization of core language concepts such as operator overloading, virtual functions, inheritance, and templates. Trained on the IBM Project CodeNet dataset, our model achieves an average F1 score of 0.90 across topics and 0.75 in code-topic highlight. Our findings contribute empirical insights and a reusable pipeline for researchers and practitioners interested in code analysis and data-driven software engineering.
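The sliding-window and voting idea mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `keyword_classifier` is a hypothetical stand-in for the trained multi-label SVM, and the window size, stride, and `min_share` voting threshold are illustrative parameters that do not come from the paper.

```python
from collections import Counter
from typing import Callable, List, Set


def keyword_classifier(lines: List[str]) -> Set[str]:
    """Hypothetical stand-in for a trained multi-label classifier."""
    text = "\n".join(lines)
    labels = set()
    if "template" in text:
        labels.add("templates")
    if "virtual" in text:
        labels.add("virtual functions")
    if "operator" in text:
        labels.add("operator overloading")
    return labels


def localize_topics(
    code_lines: List[str],
    classify: Callable[[List[str]], Set[str]],
    window: int = 2,
    stride: int = 1,
    min_share: float = 0.5,
) -> List[Set[str]]:
    """Assign topics to each line by voting over overlapping windows."""
    n = len(code_lines)
    votes = [Counter() for _ in range(n)]  # per-line topic vote counts
    covered = [0] * n                      # number of windows covering each line
    last_start = max(n - window, 0)
    for start in range(0, last_start + 1, stride):
        end = min(start + window, n)
        labels = classify(code_lines[start:end])
        for i in range(start, end):
            covered[i] += 1
            votes[i].update(labels)
    # Keep a topic for a line only if enough covering windows voted for it.
    return [
        {t for t, c in votes[i].items() if covered[i] and c / covered[i] >= min_share}
        for i in range(n)
    ]


snippet = [
    "int x = 0;",
    "template <typename T>",
    "T id(T v) { return v; }",
    "int y = 1;",
    "int z = 2;",
]
line_topics = localize_topics(snippet, keyword_classifier)
for line, topics in zip(snippet, line_topics):
    print(f"{sorted(topics)!s:16} {line}")
```

Overlapping windows let each line collect several votes, so a topic is highlighted only where a sufficient share of the windows covering that line agree. Lines at the edges are covered by fewer windows, so their labels rest on fewer votes.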
SE Sep 23, 2025
Reverse Engineering User Stories from Code using Large Language Models
Mohamed Ouf, Haoyu Li, Michael Zhang et al.
User stories are essential in agile development, yet often missing or outdated in legacy and poorly documented systems. We investigate whether large language models (LLMs) can automatically recover user stories directly from source code and how prompt design impacts output quality. Using 1,750 annotated C++ snippets of varying complexity, we evaluate five state-of-the-art LLMs across six prompting strategies. Results show that all models achieve, on average, an F1 score of 0.8 for code up to 200 NLOC. Our findings show that a single illustrative example enables the smallest model (8B) to match the performance of a much larger 70B model. In contrast, structured reasoning via Chain-of-Thought offers only marginal gains, primarily for larger models.
HC Feb 27, 2022
A Decade of Information Architecture in HCI: A Systematic Literature Review
Mariam Guizani
Information Architecture (IA) is a blueprint for the information system in websites or other information-rich environments. It corresponds to how we organize, label, and structure information. The importance of Information Architecture and its influence on a system's usability is widely discussed in the literature. Because of the inherent connection between Information Architecture concepts and the Human Computer Interaction (HCI) field, we decided to investigate how previous research has used Information Architecture in the context of Human Computer Interaction (IAinHCI). To do so, we followed a two-phase process. First, we conducted a Systematic Literature Review (SLR). We queried both the ACM and IEEE databases. We filtered and assessed 311 papers that spanned a decade of research on Information Architecture. We found 25 papers that utilized Information Architecture in the context of Human Computer Interaction. Then, we followed a Background Reference Search process using the papers resulting from the SLR as a starting set. We assessed the eligibility of the reference lists of all 25 papers and found eight additional papers that were relevant to our research question. Results of our review show that IAinHCI papers fall under seven main categories, from IoT to the semantic web and ubiquitous technology. The website category, however, was both the most consistent over the years and the most prevalent, accounting for 67% of the papers. Our findings suggest that IA has not yet reached its full potential and that there is still room for research to leverage and expand the IA knowledge base, promising a prosperous future for Information Architecture.