SE · Feb 20, 2018

Entropy Guided Spectrum Based Bug Localization Using Statistical Language Model

arXiv:1802.06947v1 · 12 citations
Originality: Incremental advance
AI Analysis

This paper addresses bug localization for software developers, but the contribution is incremental because it builds on existing spectrum-based methods.

The authors tackled the problem of bug localization in software by proposing EnSpec, which uses code entropy from a statistical language model to guide spectrum-based localization, and demonstrated that it outperforms a state-of-the-art technique on two benchmarks.

Locating bugs is one of the most important activities in software development and maintenance, and it is challenging because no fixed set of rules identifies all types of bugs. Existing automatic bug localization tools rank suspicious program elements using heuristics based on test coverage, pre-determined buggy patterns, or textual similarity with bug reports. However, since these techniques each rely on information from a single source, they often suffer when that source is inadequate. For instance, the popular spectrum-based bug localization may not work well with a poorly written test suite. In this paper, we propose a new approach, EnSpec, that guides spectrum-based bug localization using code entropy, a metric derived from a statistical language model that represents the naturalness of code. Our intuition is that since buggy code tends to have high entropy, spectrum-based bug localization combined with code entropy would be more robust in discriminating buggy lines from non-buggy lines. We realized our idea in a prototype and performed an extensive evaluation on two popular, publicly available benchmarks. Our results demonstrate that EnSpec outperforms a state-of-the-art spectrum-based bug localization technique.
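
To make the intuition in the abstract concrete, here is a minimal sketch of how a spectrum score and a code-entropy score could be combined to rank lines. It is not EnSpec's implementation: the abstract does not say how the two signals are combined, so the Ochiai formula, the add-one-smoothed unigram language model, the linear weighting, and all names (`ochiai`, `line_entropy`, `rank_lines`, `weight`) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def ochiai(failed_cov, passed_cov, total_failed):
    """Ochiai suspiciousness of one line (assumed spectrum formula).

    failed_cov / passed_cov: number of failing / passing tests that run the line.
    """
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom else 0.0


def line_entropy(tokens, counts, vocab_size):
    """Average bits per token under an add-one-smoothed unigram model."""
    if not tokens:
        return 0.0
    total = sum(counts.values())
    bits = 0.0
    for tok in tokens:
        prob = (counts[tok] + 1) / (total + vocab_size)
        bits -= math.log2(prob)
    return bits / len(tokens)


def rank_lines(lines, coverage, total_failed, corpus_tokens, weight=0.5):
    """Rank lines by a weighted sum of Ochiai and normalized entropy.

    lines: {line_no: [tokens]}; coverage: {line_no: (failed_cov, passed_cov)}.
    weight: hypothetical mixing parameter for the entropy term.
    """
    counts = Counter(corpus_tokens)
    vocab_size = len(counts) + 1  # one extra slot for unseen tokens
    entropy = {n: line_entropy(t, counts, vocab_size) for n, t in lines.items()}
    max_entropy = max(entropy.values(), default=0.0) or 1.0
    scores = {}
    for n in lines:
        failed_cov, passed_cov = coverage[n]
        spectrum = ochiai(failed_cov, passed_cov, total_failed)
        scores[n] = (1 - weight) * spectrum + weight * entropy[n] / max_entropy
    # Most suspicious line first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy example: both lines have the same coverage, so the spectrum score ties;
# the line with tokens rarely seen in the training corpus has higher entropy.
corpus = "int i = 0 ; int j = 0 ; int k = 0 ;".split()
lines = {
    1: ["int", "i", "=", "0", ";"],
    2: ["int", "x", "=", "foo", "(", ")", ";"],
}
coverage = {1: (1, 1), 2: (1, 1)}
ranked = rank_lines(lines, coverage, total_failed=1, corpus_tokens=corpus)
```

In this toy case the test suite cannot tell the two lines apart, which mirrors the "poorly written test suite" scenario the abstract mentions; the entropy term is what breaks the tie. A realistic setup would replace the unigram model with a stronger language model trained on a large code corpus.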

