SEJun 27, 2019

Evaluating data-flow coverage in spectrum-based fault localization

arXiv:1906.11715v114 citationsHas Code
Originality Incremental advance
AI Analysis

This work addresses improving debugging efficiency for software developers by showing data-flow spectra can enhance fault localization, though it is an incremental improvement over existing SFL techniques.

The study evaluated the effectiveness of using data-flow spectra instead of control-flow spectra in Spectrum-based Fault Localization (SFL) for debugging software, finding that data-flow spectra ranked up to 50% more faults in the top-15 positions and improved effectiveness for most metrics, though with a higher execution cost averaging 353% overhead.

Background: Debugging is a key task during the software development cycle. Spectrum-based Fault Localization (SFL) is a promising technique to improve and automate debugging. SFL techniques use control-flow spectra to pinpoint the most suspicious program elements. However, data-flow spectra provide more detailed information about the program execution, which may be useful for fault localization. Aims: We evaluate the effectiveness and efficiency of ten SFL ranking metrics using data-flow spectra. Method: We compare the performance of data- and control-flow spectra for SFL using 163 faults from 5 real-world open source programs, which contain from 468 to 4130 test cases. The data- and control-flow spectra types used in our evaluation are definition-use associations (DUAs) and lines, respectively. Results: Using data-flow spectra, up to 50% more faults are ranked in the top-15 positions compared to control-flow spectra. Also, most SFL ranking metrics present better effectiveness using data-flow to inspect up to the top-40 positions. The execution cost of data-flow spectra is higher than control-flow, taking from 22 seconds to less than 9 minutes. Data-flow has an average overhead of 353% for all programs, while the average overhead for control-flow is of 102%. Conclusions: The results suggest that SFL techniques can benefit from using data-flow spectra to classify faults in better positions, which may lead developers to inspect less code to find bugs. The execution cost to gather data-flow is higher compared to control-flow, but it is not prohibitive. Moreover, data-flow spectra also provide information about suspicious variables for fault localization, which may improve the developers' performance using SFL.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes