SE · Feb 13, 2018

Replication studies considered harmful

arXiv:1802.04580v1 · 19 citations
AI Analysis

This paper addresses the inefficiency of replication practice in empirical software engineering research, which it presents as a potential waste of scientific resources.

The paper argues that replication studies in software engineering often fail to contribute meaningful knowledge: because prediction intervals are wide, most replications count as confirmatory, yet add negligibly to what is known. It advocates meta-analysis as a more effective alternative.

CONTEXT: There is growing interest in establishing software engineering as an evidence-based discipline. To that end, replication is often used to gain confidence in empirical findings, as opposed to reproduction, where the goal is to show the correctness or validity of the published results. OBJECTIVE: To consider what is required for a replication study to confirm the original experiment and apply this understanding in software engineering. METHOD: Simulation is used to demonstrate why the prediction interval for confirmation can be surprisingly wide. This analysis is applied to three recent replications. RESULTS: It is shown that because the prediction intervals are wide, almost all replications are confirmatory, so in that sense there is no 'replication crisis'; however, the contributions to knowledge are negligible. CONCLUSIONS: Replicating empirical software engineering experiments, particularly if they are under-powered or under-reported, is a waste of scientific resources. By contrast, meta-analysis is strongly advocated so that all relevant experiments are combined to estimate the population effect.
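
The METHOD step can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own code. It assumes a two-group experiment summarized by Cohen's d, the usual large-sample approximation for the variance of d, and a normal-approximation 95% prediction interval. The true effect (0.5), the group size (20), and all function names are illustrative assumptions.

```python
import math
import random
import statistics


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


def d_variance(d, n1, n2):
    """Large-sample approximation to the sampling variance of d (assumed)."""
    return (n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2))


def prediction_interval(d_orig, n_orig, n_rep, z=1.96):
    """Approximate 95% prediction interval for a replication's d.

    Combines the sampling variance of the original study with that of the
    replication; n_orig and n_rep are per-group sample sizes.
    """
    se = math.sqrt(
        d_variance(d_orig, n_orig, n_orig) + d_variance(d_orig, n_rep, n_rep)
    )
    return d_orig - z * se, d_orig + z * se


def run_experiment(rng, true_d, n):
    """Simulate one two-group experiment with unit-variance normal data."""
    treatment = [rng.gauss(true_d, 1.0) for _ in range(n)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return cohens_d(treatment, control)


def simulate(true_d=0.5, n=20, trials=2000, seed=1):
    """Return (share of replications inside the interval, mean interval width)."""
    rng = random.Random(seed)
    inside = 0
    widths = []
    for _ in range(trials):
        d_orig = run_experiment(rng, true_d, n)
        d_rep = run_experiment(rng, true_d, n)
        lo, hi = prediction_interval(d_orig, n, n)
        widths.append(hi - lo)
        inside += lo <= d_rep <= hi
    return inside / trials, statistics.mean(widths)


coverage, mean_width = simulate()
print(f"coverage={coverage:.3f}, mean PI width={mean_width:.2f}")
```

Under these assumptions the interval for a replication of a small experiment spans well over one standard deviation of effect size, so nearly any replication result falls inside it. That is the sense in which a replication can be "confirmatory" while adding little information.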

