SE, Aug 28, 2018
Coincidental Correctness in the Defects4J Benchmark
Rawad Abou Assi, Chadi Trad, Marwan Maalouf et al.
Coincidental correctness (CC) arises when a defective program produces the correct output even though the defect within it was exercised. Researchers have recognized the negative impact of coincidental correctness, and the authors have previously conducted a study demonstrating its prevalence in test suites. However, that study was limited to system tests and small subjects seeded with artificial defects. In this paper, we conduct a wider-scope study of CC that addresses the following research questions in the context of the Defects4J benchmark: RQ1: Is CC prevalent in Defects4J? RQ2: Is CC affected by the testing levels in Defects4J? RQ3: Do CC tests induce peculiar infection paths in Defects4J? RQ4: Are the infections likely to be nullified within or outside the buggy method? ...
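To make the definition concrete, here is a minimal sketch of a coincidentally correct execution. The functions `scaled_correct`, `scaled_buggy`, and `classify` are hypothetical and are not taken from the paper or from Defects4J. Every input executes the defective statement and infects the intermediate value `y`, but for some inputs the integer division nullifies the infection before it reaches the output.

```python
def scaled_correct(x):
    """Reference behavior: add 2, then halve with integer division."""
    y = x + 2
    return y // 2


def scaled_buggy(x):
    """Defective version: the constant is wrong, so y is always infected."""
    y = x + 1  # defect: should be x + 2
    return y // 2  # integer division can mask the off-by-one in y


def classify(inputs):
    """Split inputs into failing runs and coincidentally correct runs.

    Every run executes the defect, so a run with the right output is
    coincidentally correct rather than genuinely passing.
    """
    result = {"failing": [], "coincidentally_correct": []}
    for x in inputs:
        if scaled_buggy(x) == scaled_correct(x):
            result["coincidentally_correct"].append(x)
        else:
            result["failing"].append(x)
    return result
```

In this toy case the infection is nullified inside the buggy function itself, which is the kind of distinction RQ4 asks about.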
SE, Aug 28, 2018
CFAAR: Control Flow Alteration to Assist Repair
Chadi Trad, Rawad Abou Assi, Wes Masri et al.
We present CFAAR, a program repair assistance technique that operates by selectively altering the outcome of suspicious predicates in order to yield expected behavior. CFAAR is applicable to defects that are repairable by negating predicates under specific conditions. CFAAR proceeds as follows: 1) it identifies predicates such that negating them at given instances would make the failing tests exhibit correct behavior; 2) for each candidate predicate, it uses the program state information to build a classifier that dictates when the predicate should be negated; 3) for each classifier, it leverages a Decision Tree to synthesize a patch to be presented to the developer. We evaluated our toolset using 149 defects from the IntroClass and Siemens benchmarks. CFAAR identified 91 potential candidate defects and generated plausible patches for 41 of them. Twelve of the patches are believed to be correct, whereas the rest provide repair assistance to the developer.
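Step 1 of the procedure above, and the labeled data that step 2 would consume, can be sketched as follows. This is an illustration under stated assumptions, not the CFAAR toolset: the buggy program `grade_all`, the `negate_at` hook, the test list `TESTS`, and the use of the raw score as the "program state" are all hypothetical.

```python
def grade_all(scores, negate_at=frozenset()):
    """Hypothetical buggy program with one suspicious predicate.

    `negate_at` holds the execution instances (0-based) at which the
    predicate outcome is flipped.
    """
    results = []
    for instance, score in enumerate(scores):
        outcome = score >= 60  # defect: the intended threshold is 50
        if instance in negate_at:
            outcome = not outcome
        results.append("pass" if outcome else "fail")
    return results


# Hypothetical test suite: (input, expected output).
TESTS = [
    ([70, 30], ["pass", "fail"]),
    ([55, 70], ["pass", "pass"]),
    ([40, 52, 90], ["fail", "pass", "pass"]),
]


def fixing_instances(scores, expected):
    """Instances whose single negation makes a failing test pass."""
    if grade_all(scores) == expected:
        return []  # passing test: nothing should be negated
    return [
        i for i in range(len(scores))
        if grade_all(scores, frozenset({i})) == expected
    ]


def training_samples(tests):
    """Label each predicate instance: (state, should_negate)."""
    samples = []
    for scores, expected in tests:
        fixes = set(fixing_instances(scores, expected))
        for i, score in enumerate(scores):
            samples.append((score, i in fixes))
    return samples
```

A classifier trained on such samples would then decide when the predicate should be negated. The sketch tries only single-instance negations, which is a simplification.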
SE, Aug 24, 2018
Substate Profiling for Effective Test Suite Reduction
Chadi Trad, Rawad Abou Assi, Wes Masri
Test suite reduction (TSR) aims at removing redundant test cases from regression test suites. A typical TSR approach ensures that structural profile elements covered by the original test suite are also covered by the reduced test suite. It is plausible that structural profiles might be unable to segregate failing runs from passing runs, which diminishes the effectiveness of TSR in regard to defect detection. This motivated us to explore state profiles, which are based on the collective values of program variables. This paper presents Substate Profiling, a new form of state profiling that enhances existing profile-based analysis techniques such as TSR and coverage-based fault localization. Compared to current approaches for capturing program states, Substate Profiling is more practical and finer grained. We evaluated our approach using thirteen multi-fault subject programs comprising 53 defects. Our study involved greedy TSR using Substate profiles and four structural profiles, namely, basic-block, branch, def-use pair, and the combination of the three. For the majority of the subjects, Substate Profiling detected considerably more defects with a comparable level of reduction. Also, Substate profiles were found to be complementary to structural profiles in many cases, so combining both types is beneficial.
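Greedy TSR can be sketched as a greedy set cover over profile elements. The sketch below is a generic illustration, not the paper's implementation: the function `greedy_reduce`, the test names, and the profile elements (blocks `b1` to `b3` and the state elements `x>0` and `x<0`) are hypothetical, and real substate profiles are more involved than a single string per run.

```python
def greedy_reduce(coverage):
    """Greedy test suite reduction.

    coverage maps a test name to the set of profile elements it covers.
    Repeatedly pick the test covering the most still-uncovered elements
    until the reduced suite covers everything the original suite did.
    Ties are broken by test name so the result is deterministic.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Structural profile only: t2 and t4 look redundant with t1.
structural = {
    "t1": {"b1", "b2"},
    "t2": {"b1", "b2"},
    "t3": {"b2", "b3"},
    "t4": {"b1", "b2"},
}

# Structural plus a hypothetical state element: t2 now differs from t1.
combined = {
    "t1": {"b1", "b2", "x>0"},
    "t2": {"b1", "b2", "x<0"},
    "t3": {"b2", "b3", "x>0"},
    "t4": {"b1", "b2", "x>0"},
}
```

With the structural profile alone, `t2` is discarded even though it exercises a different program state. Adding the state element keeps `t2` while still dropping `t4`.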
SE, May 2, 2017
ACDC: Altering Control Dependence Chains for Automated Patch Generation
Rawad Abou Assi, Chadi Trad, Wes Masri
Once a failure is observed, the primary concern of the developer is to identify what caused it in order to repair the code that induced the incorrect behavior. Until a permanent repair is afforded, code repair patches are invaluable. The aim of this work is to devise an automated patch generation technique that proceeds as follows: Step 1) It identifies a set of failure-causing control dependence chains that are minimal in terms of number and length. Step 2) It identifies a set of predicates within the chains along with associated execution instances, such that negating the predicates at the given instances would exhibit correct behavior. Step 3) For each candidate predicate, it creates a classifier that dictates when the predicate should be negated to yield correct program behavior. Step 4) Prior to each candidate predicate, the faulty program is injected with a call to its corresponding classifier, passing it the program state and getting a return value that predicts whether to negate the predicate. The role of the classifiers is to ensure that: 1) the predicates are not negated during passing runs; and 2) the predicates are negated at the appropriate instances within failing runs. We implemented our patch generation approach for the Java platform and evaluated our toolset using 148 defects from the IntroClass and Siemens benchmarks. The toolset identified 56 full patches and another 46 partial patches, and the classification accuracy averaged 84%.
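Steps 3 and 4 can be illustrated with a small sketch. It is written in Python for brevity although the toolset targets Java, and everything in it is an assumption: the training samples in `TRAINING`, the 1-nearest-neighbour stand-in classifier `should_negate` (the abstract does not say which classifier is used), and the buggy predicate in `is_passing_patched`.

```python
# Hypothetical labeled samples: (program state, should the predicate be negated?).
TRAINING = [
    (30, False), (40, False),
    (52, True), (55, True),
    (70, False), (90, False),
]


def should_negate(score, training=TRAINING):
    """Stand-in classifier: copy the label of the nearest training sample."""
    _, label = min(training, key=lambda sample: abs(sample[0] - score))
    return label


def is_passing_patched(score):
    """Faulty predicate with an injected classifier call in front of its use."""
    outcome = score >= 60  # defect: the intended threshold is 50
    if should_negate(score):  # injected call, receives the program state
        outcome = not outcome
    return outcome
```

Here `is_passing_patched(53)` yields `True` because the classifier negates the faulty outcome, while `is_passing_patched(75)` and `is_passing_patched(35)` are left untouched. A classifier learned from so few samples can also misfire: `is_passing_patched(47)` returns `True` although the intended result is `False`, which shows why classification accuracy matters for such patches.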