SE · Aug 28, 2018

Coincidental Correctness in the Defects4J Benchmark

Rawad Abou Assi, Chadi Trad, Marwan Maalouf, Wes Masri

arXiv:1808.09233v48.23 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This paper addresses the reliability of software testing, which matters to developers, but the contribution is incremental: it extends the authors' prior work to a broader benchmark.

The study investigates coincidental correctness in the Defects4J benchmark and reports that it is prevalent, that it is affected by the testing level, and that infections are often nullified outside the buggy method.

Coincidental correctness (CC) arises when a defective program produces the correct output despite the fact that the defect within was exercised. Researchers have recognized the negative impact of coincidental correctness, and the authors have previously conducted a study demonstrating its prevalence in test suites. However, that study was limited to system tests and small subjects seeded with artificial defects. In this paper, we conduct a wider scope study of CC that addresses the following research questions in the context of the Defects4J benchmark: RQ1: Is CC prevalent in Defects4J? RQ2: Is CC affected by the testing levels in Defects4J? RQ3: Do CC tests induce peculiar infection paths in Defects4J? RQ4: Are the infections likely to be nullified within or outside the buggy method? ....
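To make the definition concrete, here is a minimal sketch of a coincidentally correct test. It is not taken from the paper or from Defects4J: the functions `scale_correct`, `scale_buggy`, `is_large`, and `classify` are hypothetical names invented for illustration, and the defect (using `+` where `*` was intended) is an assumed example. The defect is exercised on every call, yet the final output is wrong only for some inputs; for others the infected value is masked by the caller, that is, nullified outside the buggy method.

```python
# Illustrative sketch only: all names and the defect are hypothetical.

def scale_correct(x):
    """Intended behavior: double the input."""
    return x * 2


def scale_buggy(x):
    """Defective version: adds 2 instead of multiplying by 2."""
    return x + 2


def is_large(scale, x):
    """Caller that only observes whether the scaled value exceeds 10."""
    return scale(x) > 10


def classify(x):
    """Compare the buggy and correct programs on a single input."""
    # Infection: the buggy method returns a value that differs from the correct one.
    infected = scale_buggy(x) != scale_correct(x)
    # Failure: the difference reaches the observable output.
    failed = is_large(scale_buggy, x) != is_large(scale_correct, x)
    if failed:
        return "failing"
    if infected:
        return "CC: infection nullified outside the buggy method"
    return "CC: defect exercised, no infection"


for x in (2, 8, 20):
    print(x, classify(x))
```

With `x = 8` the test fails because the wrong value crosses the threshold differently. With `x = 20` the buggy method returns a wrong value, but the threshold comparison in the caller hides it. With `x = 2` the defective statement happens to compute the right value, so no infection arises at all. In both of the latter cases the test passes even though the defect was executed.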
