Nigel Stanger

SE
3 papers
21 citations
Novelty 20%
AI Score 38

3 Papers

SE | Jun 4, 2021 | Code
Influence of Roles in Decision-Making during OSS Development -- A Study of Python

Pankajeshwara Nand Sharma, Bastin Tony Roy Savarimuthu, Nigel Stanger

Governance has been highlighted as a key factor in the success of an Open Source Software (OSS) project. It is generally seen that in a mixed meritocracy and autocracy governance model, the decision-making (DM) responsibility regarding what features are included in the OSS is shared among members from select roles, most prominently the project leader. However, less attention has been paid to whether members in these roles are also prominent in DM discussions and in how decisions are made, which would show that they play an integral role in the success of the project. We believe that to establish their influence, it is necessary to examine not only discussions of proposals in which the project leader makes the decisions, but also those where others make the decisions. Therefore, in this study, we examine the prominence of members performing different roles in: (i) making decisions, (ii) performing certain social roles in DM discussions (e.g., discussion starters), (iii) contributing to the OSS development social network through DM discussions, and (iv) how decisions are made under both scenarios. We examine these aspects in the evolution of the well-known Python project. We carried out a data-driven longitudinal study of the project's email communication spanning 20 years, comprising about 1.5 million emails. These emails contain decisions for 466 Python Enhancement Proposals (PEPs) that document the language's evolution. Our findings make the influence of different roles transparent to future (new) members, other stakeholders, and more broadly, to the OSS research community.

SE | Feb 10, 2021 | Code
Extracting Rationale for Open Source Software Development Decisions -- A Study of Python Email Archives

Pankajeshwara Nand Sharma, Bastin Tony Roy Savarimuthu, Nigel Stanger

A sound Decision-Making (DM) process is key to the successful governance of software projects. In many Open Source Software Development (OSSD) communities, DM processes lie buried amongst vast amounts of publicly available data. Hidden within this data lies the rationale for decisions that led to the evolution and maintenance of software products. While there have been some efforts to extract DM processes from publicly available data, the rationale behind how the decisions are made has seldom been explored. Extracting the rationale for these decisions can facilitate transparency (by making them known), and also promote accountability on the part of decision-makers. This work bridges this gap by means of a large-scale study that unearths the rationale behind decisions from Python development email archives comprising about 1.5 million emails. This paper makes two main contributions. First, it makes a knowledge contribution by unearthing and presenting the rationale behind decisions made. Second, it makes a methodological contribution by presenting a heuristics-based rationale extraction system called Rationale Miner that employs multiple heuristics, and follows a data-driven, bottom-up approach to infer the rationale behind specific decisions (e.g., whether a new module is implemented based on core developer consensus or a benevolent dictator's pronouncement). Our approach can be applied to extract rationale in other OSSD communities that have similar governance structures.

38.2 | SE | May 5
Geographic Variation in Stack Overflow Code Quality: Evidence from a Cross-Regional Study of Coding Practices

Elijah Zolduoarrati, Sherlock A. Licorish, Nigel Stanger

Developers frequently reuse Stack Overflow code snippets, yet the quality of these snippets remains unevenly understood, particularly across programming languages and geographic contexts. This study investigates code quality in Stack Overflow answers from contributors located in the United States, focusing on SQL, JavaScript, Python, Ruby, and Java snippets. We evaluate four quality dimensions: reliability, readability, performance, and security. Using language-specific linting and static analysis tools, we quantify violations across states and cities, compute violation densities to enable fair regional comparison, and examine relationships between code quality and state-level diversity indicators. We further conduct inductive content analysis on code snippets from California, Utah, and North Dakota to identify qualitative patterns in code quality violations. Results show that readability violations are the most prevalent across all languages, followed by reliability, performance, and security. Common issues include improper whitespace, inconsistent formatting, program-flow errors, inefficient resource use, unsanitised inputs, and insecure dynamic evaluation. Regional analysis indicates that major technology hubs produce more parsable snippets but do not necessarily exhibit higher violation densities. States with broader access to computing devices, Internet subscriptions, higher income, and more equitable wealth distribution tend to show fewer code quality violations. Qualitative findings suggest that established technology regions often produce more complex violations, while less mature technology regions display more fundamental errors. These findings highlight the socio-technical nature of code quality in community question-answering platforms and suggest that developers should exercise caution when reusing online code snippets.