SE PL · Feb 25, 2019

A Systematic Impact Study for Fuzzer-Found Compiler Bugs

Michaël Marcozzi, Qiyi Tang, Alastair F. Donaldson, Cristian Cadar

arXiv:1902.09334v38.553 citations

Originality Synthesis-oriented

AI Analysis

This work addresses the gap in understanding the real-world consequences of compiler bugs for software developers and users, but it is incremental as it builds on existing fuzzing research.

The study assesses the practical impact of fuzzer-found compiler bugs on real-world applications by analyzing miscompilation bugs in the Clang/LLVM compiler. It finds that almost half of these bugs propagate to the generated binaries for some packages, but that they rarely affect the syntax of the generated code and cause only two test suite failures in total.

Despite much recent interest in compiler randomized testing (fuzzing), the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler. We follow a rigorous methodology where the bug impact over the compiled application is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which generated assembly code changes syntactically due to triggering of the bug; and (3) the extent to which such changes cause regression test suite failures and could be used to manually trigger divergences during execution. The study is conducted with respect to the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical and now fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 bugs found by human users compiling real code or by formal verification. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for some packages, but rarely affect their syntax and cause two failures in total when running their test suites. User-reported and formal verification bugs do not exhibit a higher impact, with less frequently triggered bugs and one test failure. Our manual analysis of a selection of bugs, either fuzzer-found or not, suggests that none can easily trigger a runtime divergence on the packages considered in the analysis, and that in general they affect only corner cases.
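The three-stage evaluation described in the abstract can be illustrated with a small sketch. This is not the authors' tooling: it assumes that each package is compiled twice, once with the buggy compiler and once with a fixed one, and that the two assembly outputs are available as text. The function names `changed_line_ratio` and `classify_impact`, the category labels, and the line-based difference measure are all hypothetical choices made for illustration.

```python
import difflib


def changed_line_ratio(asm_buggy: str, asm_fixed: str) -> float:
    """Fraction of assembly lines that differ between the two builds.

    Hypothetical metric: 0.0 means the outputs are line-for-line
    identical, 1.0 means no line could be matched.
    """
    a = asm_buggy.splitlines()
    b = asm_fixed.splitlines()
    if not a and not b:
        return 0.0
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - matched / max(len(a), len(b))


def classify_impact(bug_triggered: bool, asm_buggy: str,
                    asm_fixed: str, tests_failed: int) -> str:
    """Map one (bug, package) pair onto the three stages (assumed labels)."""
    # Stage 1: did the buggy code path appear to run during compilation?
    if not bug_triggered:
        return "not triggered"
    # Stage 2: did triggering the bug change the generated assembly?
    if asm_buggy == asm_fixed:
        return "triggered, no syntactic change"
    # Stage 3: do the changes surface as regression test failures?
    if tests_failed == 0:
        return "assembly changed, tests pass"
    return "assembly changed, tests fail"


if __name__ == "__main__":
    buggy = "mov eax, 1\nadd eax, 2\nret"
    fixed = "mov eax, 1\nadd eax, 3\nret"
    print(changed_line_ratio(buggy, fixed))
    print(classify_impact(True, buggy, fixed, tests_failed=0))
```

A line-based diff is only a rough proxy for syntactic change, since reordered or renamed registers can inflate the ratio. The manual search for inputs that trigger a runtime divergence, which the abstract also describes, is not something this sketch attempts to model.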
