Xiaotian Ma

11.4 · SE · May 19

DRReduce: Enhancing Syntax-Guided Program Reduction with Dependency Reconstruction

Qiong Feng, Xiaotian Ma, Yongqiang Tian et al.

Program reduction is a technique for simplifying large, failure-inducing programs into minimal reproducible test cases. Language-specific tools such as CReduce achieve strong performance by leveraging deep semantic knowledge of C/C++, but are tightly coupled to a single language family. Language-agnostic reducers such as Perses address this by applying syntax-guided search across any grammar, yet share a fundamental limitation: deleting a node or subtree in isolation often breaks semantic coherence, causing the property checker to reject the deletion and forcing the reducer to backtrack, which limits overall reduction effectiveness and efficiency. In this paper, we propose DRReduce, a framework that bridges this gap by augmenting language-agnostic syntactic reduction with a lightweight semantic layer: dependency reconstruction, which repairs program dependencies broken by a deletion in order to preserve the semantic validity of intermediate programs and increase the acceptance rate of the property checker. DRReduce constructs a semantic dependency graph from the input program, performs semantically coherent deletions with dependency reconstruction, and delegates further minimization to a syntax-guided reducer. We implement DRReduce for C and Java and evaluate it on real-world bug-triggering programs. Compared to state-of-the-art syntax-guided reducers, DRReduce achieves average size reductions of 51.9%, 14.9%, and 19.8% over Perses, WDD, and CDD, respectively, while completing reduction faster on the majority of programs. Compared to language-specific tools, DRReduce achieves results comparable to CReduce and Latra without any language-specific transformation rules, at 3.3x and 1.2x higher efficiency than CReduce and Latra on average, respectively. An ablation study confirms that dependency reconstruction reduces query invocations by 80.2%, reduction time by 58.7%, and final token count by over 55.1%.
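The idea of repairing dependencies after a deletion can be illustrated with a toy model. The sketch below is not DRReduce's algorithm or API; it assumes a hypothetical single-assignment program in which each statement defines one variable and reads some others, and it "reconstructs" a broken dependency by rewiring users of a deleted variable to that variable's own inputs. The names `Stmt`, `is_valid`, `delete_naive`, `delete_with_reconstruction`, and `reduce_program` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    target: str   # variable defined by this statement (assumed unique)
    uses: tuple   # variables read by this statement


def is_valid(program):
    """Toy validity check: every variable is defined before it is used."""
    defined = set()
    for stmt in program:
        if any(var not in defined for var in stmt.uses):
            return False
        defined.add(stmt.target)
    return True


def delete_naive(program, i):
    """Delete statement i in isolation, leaving any users dangling."""
    return program[:i] + program[i + 1:]


def delete_with_reconstruction(program, i):
    """Delete statement i and rewire its users to the deleted statement's inputs."""
    removed = program[i]
    result = []
    for j, stmt in enumerate(program):
        if j == i:
            continue
        if removed.target in stmt.uses:
            rewired = []
            for var in stmt.uses:
                replacement = removed.uses if var == removed.target else (var,)
                for r in replacement:
                    if r not in rewired:
                        rewired.append(r)
            stmt = Stmt(stmt.target, tuple(rewired))
        result.append(stmt)
    return result


def reduce_program(program, property_holds, delete):
    """One greedy pass: keep a deletion only if the checker accepts the result."""
    queries = 0
    i = 0
    while i < len(program):
        candidate = delete(program, i)
        queries += 1
        if is_valid(candidate) and property_holds(candidate):
            program = candidate   # deletion accepted; retry the same index
        else:
            i += 1                # deletion rejected; move on
    return program, queries


# A chain a -> b -> c -> d, where the (hypothetical) failure only needs d.
chain = [Stmt("a", ()), Stmt("b", ("a",)), Stmt("c", ("b",)), Stmt("d", ("c",))]
needs_d = lambda prog: any(stmt.target == "d" for stmt in prog)

naive_result, _ = reduce_program(chain, needs_d, delete_naive)
repaired_result, _ = reduce_program(chain, needs_d, delete_with_reconstruction)
```

In this toy chain, naive deletion cannot remove anything, because every candidate either leaves an undefined variable or drops `d`; with rewiring, the pass shrinks the program to the single statement defining `d`. A real reducer would work on a parse tree and a richer dependency graph, which this sketch does not attempt to model.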

SE · Dec 5, 2024

Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair

Qiong Feng, Xiaotian Ma, Jiayi Sheng et al.

LLMs have garnered considerable attention for their potential to streamline Automated Program Repair (APR). LLM-based approaches can either insert the correct code or directly generate patches when provided with buggy methods. However, most LLM-based APR methods rely on a single type of software information, without fully leveraging different software artifacts. Moreover, many LLM-based approaches do not explore which specific types of information best assist in APR. Addressing this gap is crucial for advancing LLM-based APR techniques. We propose DEVLoRe, which uses issue content (description and message) and stack error traces to localize buggy methods, then relies on debug information in the buggy methods, together with issue content and stack error traces, to localize buggy lines and generate plausible patches that pass all unit tests. The results show that while issue content is particularly effective in assisting LLMs with fault localization and program repair, different types of software artifacts complement each other. By incorporating different artifacts, DEVLoRe successfully locates 49.3% and 47.6% of single and non-single buggy methods and generates 56.0% and 14.5% plausible patches for the Defects4J v2.0 dataset, respectively. This outperforms current state-of-the-art APR methods. Furthermore, we re-implemented and evaluated our framework on SWE-bench Lite, demonstrating its effectiveness in resolving 9 unique issues compared to other state-of-the-art frameworks using the same or more advanced models. We also discuss whether a leading framework for Python code can be directly applied to Java code, or vice versa. The source code and experimental results of this work for replication are available at https://github.com/XYZboom/DEVLoRe.

2 Papers