Leveraging Models to Reduce Test Cases in Software Repositories
This work addresses the bottleneck of slow test case reduction for software developers and testers, offering an incremental improvement over existing techniques.
The paper tackles time-consuming test case reduction by proposing a model-guided approach that predicts semantic validity and skips candidates that are unlikely to succeed. It reports a 30% geomean improvement in reduction time and 14% to 61% fewer removal trials, at 77% average precision.
Given a failing test case, test case reduction yields a smaller test case that reproduces the failure. This process can be time-consuming due to repeated trial and error with smaller test cases. Current techniques speed up reduction by only exploring syntactically valid candidates, but they still spend significant effort on semantically invalid candidates. In this paper, we propose a model-guided approach to speed up test case reduction. The approach trains a model of semantic properties driven by syntactic test case properties. By using this model, we can skip testing even syntactically valid test case candidates that are unlikely to succeed. We evaluate this model-guided reduction on a suite of 14 large fuzzer-generated C test cases from the bug repositories of two well-known C compilers, GCC and Clang. Our results show that with an average precision of 77%, we can decrease the number of removal trials by 14% to 61%. We observe a 30% geomean improvement in reduction time over the state-of-the-art technique while preserving similar reduction power.
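
To make the skipping idea concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes a simple greedy reducer that removes one unit at a time, and it substitutes a hand-written heuristic for the trained model. All names (reduce_with_model, still_fails, predict_valid, threshold) and the four-line toy program are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def reduce_with_model(
    units: Sequence[str],
    still_fails: Callable[[List[str]], bool],
    predict_valid: Callable[[List[str]], float],
    threshold: float = 0.5,
) -> Tuple[List[str], int, int]:
    """Greedy one-unit-at-a-time reduction with model-based skipping.

    Returns (reduced units, oracle trials run, candidates skipped).
    """
    current = list(units)
    trials = 0
    skipped = 0
    changed = True
    while changed:  # repeat passes until no removal succeeds
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            # Skip the expensive oracle when the model considers the
            # candidate unlikely to be semantically valid.
            if predict_valid(candidate) < threshold:
                skipped += 1
                i += 1
                continue
            trials += 1
            if still_fails(candidate):
                current = candidate  # keep the removal, retry same index
                changed = True
            else:
                i += 1
    return current, trials, skipped


# Toy stand-ins (assumptions): a four-line "program" where the failure
# needs both the declaration of x and the call that uses it.
DECL, USE = "int x = 1;", "crash(x);"
program = [DECL, "int y = 2;", "int unused = 3;", USE]


def toy_oracle(candidate: List[str]) -> bool:
    # Stands in for compiling the candidate and checking for the failure.
    return DECL in candidate and USE in candidate


def toy_model(candidate: List[str]) -> float:
    # Stands in for the trained model: a use of x without its
    # declaration is predicted to be semantically invalid.
    return 0.0 if USE in candidate and DECL not in candidate else 1.0


reduced, trials, skipped = reduce_with_model(program, toy_oracle, toy_model)
# threshold=0.0 disables skipping, giving an unguided baseline.
_, baseline_trials, _ = reduce_with_model(
    program, toy_oracle, toy_model, threshold=0.0
)
print(reduced, trials, skipped, baseline_trials)
```

On this toy input the guided run reaches the same two-line result as the unguided baseline while invoking the oracle 4 times instead of 6. The threshold sets the trade-off: a higher value skips more candidates but risks discarding removals that would have succeeded, which is why the precision of the model matters for preserving reduction power.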