SE · Feb 26, 2019

Ahead of Time Mutation Based Fault Localisation using Statistical Inference

Jinhan Kim, Gabin An, Robert Feldt, Shin Yoo

arXiv:1902.09729v2 · 5.0 · 11 citations

Originality: Incremental advance

AI Analysis

The work addresses a practical adoption challenge of MBFL for software developers by reducing analysis costs. It is an incremental advance because it builds on existing MBFL methods.

The paper tackles the high cost of mutation-based fault localization (MBFL) by introducing SIMFL, which performs mutation analysis ahead of time so that its cost can be amortized before any failure is observed. On Defects4J, SIMFL localizes up to 113 of 203 studied faults (55%) at the top rank and 159 (78%) within the top five, outperforming existing MBFL techniques. Mutation sampling reduces the cost further: SIMFL retains 80% of its top-rank accuracy when using only 10% of the generated mutants.

Mutation analysis can effectively capture the dependency between source code and test results. This has been exploited by Mutation Based Fault Localisation (MBFL) techniques. However, MBFL techniques suffer from the need to expend the high cost of mutation analysis after the observation of failures, which may present a challenge for its practical adoption. We introduce SIMFL (Statistical Inference for Mutation-based Fault Localisation), an MBFL technique that allows users to perform the mutation analysis in advance before a failure is observed, allowing the amortisation of the analysis cost. SIMFL uses mutants as artificial faults and aims to learn the failure patterns among test cases against different locations of mutations. Once a failure is observed, SIMFL requires either almost no or very small additional cost for analysis, depending on the used inference model. An empirical evaluation using Defects4J shows that SIMFL can successfully localise up to 113 out of 203 studied faults (55%) at the top, and 159 (78%) faults within the top five, significantly outperforming existing MBFL techniques while using the results of mutation analysis that has been undertaken before the test failure. The amortised cost of mutation analysis can be further reduced by mutation sampling: SIMFL retains 80% of its localisation accuracy at the top rank when using only 10% of generated mutants, compared to results obtained without sampling.
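The idea of learning failure patterns ahead of time and then ranking locations once a failure is observed can be illustrated with a minimal sketch. It is not the paper's implementation: the exact-match counting rule, the function names `build_index` and `rank_locations`, and the data in `KILL_MATRIX` are all assumptions made for illustration. The sketch estimates, for each mutated location, the share of mutants with a given failing-test pattern that sit at that location.

```python
from collections import Counter, defaultdict


def build_index(kill_matrix):
    """Ahead of time: group mutants by the exact set of tests they fail.

    kill_matrix is an iterable of (location, failing_tests) pairs,
    one pair per mutant (a hypothetical format for this sketch).
    """
    index = defaultdict(Counter)
    for location, failing_tests in kill_matrix:
        index[frozenset(failing_tests)][location] += 1
    return index


def rank_locations(index, observed_failures):
    """After a failure: score locations by the share of mutants that
    produced exactly the observed failing-test set (assumed rule)."""
    counts = index.get(frozenset(observed_failures), Counter())
    total = sum(counts.values())
    if total == 0:
        return []  # no mutant reproduced this failure pattern
    scored = [(loc, n / total) for loc, n in counts.items()]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical mutation results: (mutated method, tests failing on the mutant).
KILL_MATRIX = [
    ("Parser.parse", {"test_parse", "test_roundtrip"}),
    ("Parser.parse", {"test_parse", "test_roundtrip"}),
    ("Parser.parse", {"test_parse"}),
    ("Writer.write", {"test_roundtrip"}),
    ("Writer.write", {"test_parse", "test_roundtrip"}),
]

INDEX = build_index(KILL_MATRIX)  # the expensive step, done before any failure
RANKING = rank_locations(INDEX, {"test_parse", "test_roundtrip"})
```

In this sketch the costly step (`build_index`) runs before any failure, and the lookup in `rank_locations` is cheap, which mirrors the amortisation argument in the abstract. An exact-match rule returns nothing when no mutant reproduces the observed pattern, so a practical model would likely need a softer notion of similarity.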
