Evolutionary Mutation-based Fuzzing as Monte Carlo Tree Search
This work provides an incremental improvement in fuzzing efficiency and vulnerability discovery for software security researchers and developers.
This paper addresses seed scheduling in coverage-based greybox fuzzing (CGF) by leveraging mutation relationships among seeds to model the problem as a Monte-Carlo Tree Search (MCTS). The proposed AlphaFuzz and AlphaFuzz++ prototypes, built on AFL and AFL++, demonstrate superior performance, discovering 3 new vulnerabilities with CVEs and achieving higher code coverage on three datasets compared to state-of-the-art fuzzers.
Coverage-based greybox fuzzing (CGF) has been approved to be effective in finding security vulnerabilities. Seed scheduling, the process of selecting an input as the seed from the seed pool for the next fuzzing iteration, plays a central role in CGF. Although numerous seed scheduling strategies have been proposed, most of them treat these seeds independently and do not explicitly consider the relationships among the seeds. In this study, we make a key observation that the relationships among seeds are valuable for seed scheduling. We design and propose a "seed mutation tree" by investigating and leveraging the mutation relationships among seeds. With the "seed mutation tree", we further model the seed scheduling problem as a Monte-Carlo Tree Search (MCTS) problem. That is, we select the next seed for fuzzing by walking this "seed mutation tree" through an optimal path, based on the estimation of MCTS. We implement two prototypes, AlphaFuzz on top of AFL and AlphaFuzz++ on top of AFL++. The evaluation results on three datasets (the UniFuzz dataset, the CGC binaries, and 12 real-world binaries) show that AlphaFuzz and AlphaFuzz++ outperform state-of-the-art fuzzers with higher code coverage and more discovered vulnerabilities. In particular, AlphaFuzz discovers 3 new vulnerabilities with CVEs.