Kaylea Champion

5 papers

31 citations

Novelty 26%

AI Score 19

Ranked #193,326 of 201,018 authors (top 96%) · #2,801 in SE (top 82%)

5 Papers

SE · Feb 27, 2021 · Code

Underproduction: An Approach for Measuring Risk in Open Source Software

Kaylea Champion, Benjamin Mako Hill

The widespread adoption of Free/Libre and Open Source Software (FLOSS) means that the ongoing maintenance of many widely used software components relies on the collaborative effort of volunteers who set their own priorities and choose their own tasks. We argue that this has created a new form of risk that we call 'underproduction' which occurs when the supply of software engineering labor becomes out of alignment with the demand of people who rely on the software produced. We present a conceptual framework for identifying relative underproduction in software as well as a statistical method for applying our framework to a comprehensive dataset from the Debian GNU/Linux distribution that includes 21,902 source packages and the full history of 461,656 bugs. We draw on this application to present two experiments: (1) a demonstration of how our technique can be used to identify at-risk software packages in a large FLOSS repository and (2) a validation of these results using an alternate indicator of package risk. Our analysis both demonstrates the utility of our approach and reveals the existence of widespread underproduction in a range of widely-installed software components in Debian.

HC · Feb 11, 2022

The Risks, Benefits, and Consequences of Prepublication Moderation: Evidence from 17 Wikipedia Language Editions

Chau Tran, Kaylea Champion, Benjamin Mako Hill et al.

Many online communities rely on postpublication moderation where contributors, even those that are perceived as being risky, are allowed to publish material immediately and where moderation takes place after the fact. An alternative arrangement involves moderating content before publication. A range of communities have argued against prepublication moderation by suggesting that it makes contributing less enjoyable for new members and that it will distract established community members with extra moderation work. We present an empirical analysis of the effects of a prepublication moderation system called FlaggedRevs that was deployed by several Wikipedia language editions. We used panel data from 17 large Wikipedia editions to test a series of hypotheses related to the effect of the system on activity levels and contribution quality. We found that the system was very effective at keeping low-quality contributions from ever becoming visible. Although there is some evidence that the system discouraged participation among users without accounts, our analysis suggests that the system's effects on contribution volume and quality were moderate at most. Our findings imply that concerns regarding the major negative effects of prepublication moderation systems on contribution quality and project productivity may be overstated.

SE · Jul 29, 2021

Qualities of Quality: A Tertiary Review of Software Quality Measurement Research

Kaylea Champion, Sejal Khatri, Benjamin Mako Hill

This paper presents a tertiary review of software quality measurement research. To conduct this review, we examined an initial dataset of 7,811 articles and found 75 relevant and high-quality secondary analyses of software quality research. Synthesizing this body of work, we offer an overview of perspectives, measurement approaches, and trends. We identify five distinct perspectives that conceptualize quality as heuristic, as maintainability, as a holistic concept, as structural features of software, and as dependability. We also identify three key challenges. First, we find widespread evidence of validity questions with common measures. Second, we observe the application of machine learning methods without adequate evaluation. Third, we observe the use of aging datasets. Finally, from these observations, we sketch a path toward a theoretical framework that will allow software engineering researchers to systematically confront these weaknesses while remaining grounded in the experiences of developers and the real world in which code is ultimately deployed.

SI · Jul 4, 2020

Characterizing Online Vandalism: A Rational Choice Perspective

Kaylea Champion

What factors influence the decision to vandalize? Although the harm is clear, the benefit to the vandal is less clear. In many cases, the thing being damaged may itself be something the vandal uses or enjoys. Vandalism holds communicative value: perhaps to the vandal themselves, to some audience at whom the vandalism is aimed, and to the general public. Viewing vandals as rational community participants despite their antinormative behavior offers the possibility of engaging with or countering their choices in novel ways. Rational choice theory (RCT) as applied in value expectancy theory (VET) offers a strategy for characterizing behaviors in a framework of rational choices, and begins with the supposition that subject to some weighting of personal preferences and constraints, individuals maximize their own utility by committing acts of vandalism. This study applies the framework of RCT and VET to gain insight into vandals' preferences and constraints. Using a mixed-methods analysis of Wikipedia, I combine social computing and criminological perspectives on vandalism to propose an ontology of vandalism for online content communities. I use this ontology to categorize 141 instances of vandalism and find that the character of vandalistic acts varies by vandals' relative identifiability, policy history with Wikipedia, and the effort required to vandalize.

SI · Apr 8, 2019

Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor

Chau Tran, Kaylea Champion, Andrea Forte et al.

User-generated content sites routinely block contributions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Our analysis suggests that although Tor users who slip through Wikipedia's ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users.