Mohamed Ouf

SE
h-index: 12
4 papers
45 citations
Novelty: 39%
AI Score: 46

4 Papers

SE · Apr 24 · Code
Do Good, Stay Longer? Temporal Patterns and Predictors of Newcomer-to-Core Transitions in Conventional OSS and OSS4SG

Mohamed Ouf, Amr Mohamed, Mariam Guizani

Open Source Software (OSS) sustainability relies on newcomers transitioning to core contributors, but this pipeline is broken, with most newcomers becoming inactive after initial contributions. Open Source Software for Social Good (OSS4SG) projects, which prioritize societal impact as their primary mission, may be associated with different newcomer-to-core transition outcomes than conventional OSS projects. We compared 375 projects (190 OSS4SG, 185 OSS), analyzing 92,721 contributors and 3.5 million commits. OSS4SG projects retain contributors at 2.2X higher rates, and their contributors have a 19.6% higher probability of achieving core status. Early broad project exploration predicts core achievement (22.2% importance); conventional OSS concentrates on one dominant pathway (61.62% of transitions) while OSS4SG provides multiple pathways. Contrary to intuition, contributors who invest time learning the project before intensifying their contributions (Late Spike pattern) achieve core status 2.4-2.9X faster (21 weeks) than those who contribute intensively from day one (Early Spike pattern, 51-60 weeks). OSS4SG supports two effective temporal patterns, while only Late Spike achieves the fastest time-to-core in conventional OSS. Our findings suggest that finding a project aligned with personal values and taking time to understand the codebase before major contributions are key strategies for achieving core status. They also show that project mission is associated with measurably different environments for newcomer-to-core transitions and provide evidence-based guidance for newcomers and maintainers.

HC · Apr 23 · Code
Same Project, Different Start: How Contribution Events Shape Activity and Retention in Open Source

Mohamed Ouf, Mariam Guizani

Open source projects depend on newcomers who stay, yet most leave after a single contribution. Contribution events such as Google Summer of Code, LFX Mentorship, Hacktoberfest, and 24 Pull Requests attract thousands of newcomers each year, but whether they produce lasting contributors remains unclear. We conduct the first matched-cohort study comparing 2,001 event-based and 2,001 organic contributors across 330 projects. Our results reveal three key findings. First, event contributors have significantly higher odds of becoming core contributors (12.1% vs. 9.6%, p < 0.001, OR = 1.31) and stay significantly longer (median 8.2 vs. 4.8 months). Second, each entry mechanism is associated with a fundamentally different engagement rhythm: 68.9% of mentorship contributors sustain Steady weekly activity across their first 12 weeks, whereas 61.0% of non-mentorship contributors exhibit Front-Loading and 57.0% of organic contributors exhibit Intermittent engagement (p < 0.001). Third, Steady engagement is associated with significantly longer retention regardless of group (median 13 vs. 8 months for Front-Loading), yet mentorship contributors who lose their program scaffolding show shorter retention than self-sustained non-mentorship contributors, revealing a mentor-dependency effect. A newcomer's first 12 weeks are strongly indicative of their long-term trajectory.

LG · May 2, 2024
CityLearn v2: Energy-flexible, resilient, occupant-centric, and carbon-aware management of grid-interactive communities

Kingsley Nweye, Kathryn Kaspar, Giacomo Buscemi et al.

As more distributed energy resources become part of the demand-side infrastructure, it is important to quantify the energy flexibility they provide on a community scale, particularly to understand the impact of geographic, climatic, and occupant behavioral differences on their effectiveness, as well as to identify the best control strategies to accelerate their real-world adoption. CityLearn provides an environment for benchmarking simple and advanced distributed energy resource control algorithms including rule-based, model-predictive, and reinforcement learning control. CityLearn v2 presented here extends CityLearn v1 by providing a simulation environment that leverages the End-Use Load Profiles for the U.S. Building Stock dataset to create virtual grid-interactive communities for resilient, multi-agent distributed energy resources and objective control with dynamic occupant feedback. This work details the v2 environment design and provides application examples that utilize reinforcement learning to manage battery energy storage system charging/discharging cycles, vehicle-to-grid control, and thermal comfort during heat pump power modulation.

SE · Sep 23, 2025
Reverse Engineering User Stories from Code using Large Language Models

Mohamed Ouf, Haoyu Li, Michael Zhang et al.

User stories are essential in agile development, yet often missing or outdated in legacy and poorly documented systems. We investigate whether large language models (LLMs) can automatically recover user stories directly from source code and how prompt design impacts output quality. Using 1,750 annotated C++ snippets of varying complexity, we evaluate five state-of-the-art LLMs across six prompting strategies. Results show that all models achieve, on average, an F1 score of 0.8 for code up to 200 NLOC. Our findings show that a single illustrative example enables the smallest model (8B) to match the performance of a much larger 70B model. In contrast, structured reasoning via Chain-of-Thought offers only marginal gains, primarily for larger models.