FairFuzz: Targeting Rare Branches to Rapidly Increase Greybox Fuzz Testing Coverage
This work addresses a bottleneck in fuzz testing for software security by improving the depth of program coverage, though it is an incremental enhancement to existing tools.
The paper tackles the limited program coverage achieved by AFL fuzz testing. It proposes FairFuzz, which prioritizes inputs that exercise rare program branches and adjusts mutations so that mutated inputs keep targeting those branches. On some benchmarks this yields significant coverage increases, and on others it reaches high coverage faster than state-of-the-art versions of AFL.
In recent years, fuzz testing has proven itself to be one of the most effective techniques for finding correctness bugs and security vulnerabilities in practice. One particular fuzz testing tool, American Fuzzy Lop or AFL, has become popular thanks to its ease of use and bug-finding power. However, AFL remains limited in the depth of program coverage it achieves, in particular because it does not consider which parts of program inputs should not be mutated in order to maintain deep program coverage. We propose an approach, FairFuzz, that helps alleviate this limitation in two key steps. First, FairFuzz automatically prioritizes inputs exercising rare parts of the program under test. Second, it automatically adjusts the mutation of inputs so that the mutated inputs are more likely to exercise these same rare parts of the program. We conduct an evaluation on real-world programs against state-of-the-art versions of AFL, thoroughly repeating experiments to get good measures of variability. We find that on certain benchmarks FairFuzz shows significant coverage increases after 24 hours compared to state-of-the-art versions of AFL, while on others it achieves high program coverage at a significantly faster rate.
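The two steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use AFL's instrumentation. The toy program `toy_run`, the branch names, the "fewest hits" rarity rule, and the byte-flip probe used to find mutable positions are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, FrozenSet, List

Branch = str
# Hypothetical interface: run the program on an input, return the branches it hit.
RunFn = Callable[[bytes], FrozenSet[Branch]]


def toy_run(data: bytes) -> FrozenSet[Branch]:
    """Hypothetical program under test: a magic header guards the deeper branch."""
    hit = {"entry"}
    if data[:2] == b"OK":
        hit.add("header_ok")
        if len(data) > 2:
            hit.add("has_body")
    return frozenset(hit)


def count_hits(corpus: List[bytes], run: RunFn) -> Dict[Branch, int]:
    """Count how many corpus inputs exercise each branch."""
    counts: Dict[Branch, int] = {}
    for inp in corpus:
        for branch in run(inp):
            counts[branch] = counts.get(branch, 0) + 1
    return counts


def rarest_branch(hit_counts: Dict[Branch, int]) -> Branch:
    """Assumed rarity rule: fewest hitting inputs, ties broken by branch name."""
    return min(hit_counts, key=lambda b: (hit_counts[b], b))


def pick_inputs(corpus: List[bytes], run: RunFn, target: Branch) -> List[bytes]:
    """Step 1: prioritize the inputs that exercise the rare target branch."""
    return [inp for inp in corpus if target in run(inp)]


def mutable_positions(inp: bytes, run: RunFn, target: Branch) -> List[int]:
    """Step 2: find byte positions that can change without losing the target.

    Each position is probed by inverting that byte; the position counts as
    mutable only if the probed input still reaches the target branch.
    """
    positions = []
    for i in range(len(inp)):
        probe = inp[:i] + bytes([inp[i] ^ 0xFF]) + inp[i + 1:]
        if target in run(probe):
            positions.append(i)
    return positions


corpus = [b"OKab", b"xxab", b"yyyy"]
target = rarest_branch(count_hits(corpus, toy_run))  # "has_body"
seeds = pick_inputs(corpus, toy_run, target)         # [b"OKab"]
mask = mutable_positions(seeds[0], toy_run, target)  # [2, 3]
```

In this toy setting the header bytes at positions 0 and 1 are excluded from mutation, because changing either one makes the input miss the rare branch. Restricting later mutations to positions 2 and 3 keeps the mutated inputs on the deep path, which is the intuition behind the second step.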