Xiaotao Feng

SE
h-index: 2
5 papers
203 citations
Novelty: 46%
AI Score: 35

5 Papers

SE · Jun 30, 2025
Fuzzing: Randomness? Reasoning! Efficient Directed Fuzzing via Large Language Models

Xiaotao Feng, Xiaogang Zhu, Kun Hu et al.

Fuzzing is highly effective in detecting bugs due to the key contribution of randomness. However, randomness significantly reduces the efficiency of fuzzing, so that exposing bugs can take days or weeks. Even though directed fuzzing reduces randomness by guiding fuzzing towards target buggy locations, the dilemma of randomness still challenges directed fuzzers. Two critical components, seeds and mutators, contain randomness and are closely tied to the conditions required for triggering bugs. Therefore, to address the challenge of randomness, we propose to use large language models (LLMs) to remove the randomness in seeds and reduce the randomness in mutators. With their strong reasoning and code generation capabilities, LLMs can be used to generate reachable seeds that target pre-determined locations and to construct bug-specific mutators tailored for specific bugs. We propose RandLuzz, which integrates LLMs and directed fuzzing, to improve the quality of seeds and mutators, resulting in efficient bug exposure. RandLuzz analyzes function call chains or functionality to guide LLMs in generating reachable seeds. To construct bug-specific mutators, RandLuzz uses LLMs to perform bug analysis, obtaining information such as bug causes and mutation suggestions, which further help generate code that performs bug-specific mutations. We evaluate RandLuzz by comparing it with four state-of-the-art directed fuzzers: AFLGo, Beacon, WindRanger, and SelectFuzz. With RandLuzz-generated seeds, the fuzzers achieve an average speedup ranging from 2.1$\times$ to 4.8$\times$ compared to widely used initial seeds. Additionally, when evaluated on individual bugs, RandLuzz achieves up to a 2.7$\times$ speedup compared to the second-fastest exposure. RandLuzz even exposes 8 of the bugs within 60 seconds.

CR · May 12, 2021
Snipuzz: Black-box Fuzzing of IoT Firmware via Message Snippet Inference

Xiaotao Feng, Ruoxi Sun, Xiaogang Zhu et al.

The proliferation of Internet of Things (IoT) devices has made people's lives more convenient, but it has also raised many security concerns. Due to the difficulty of obtaining and emulating IoT firmware, the black-box fuzzing of IoT devices has become a viable option. However, existing black-box fuzzers cannot form effective mutation optimization mechanisms to guide their testing processes, mainly due to the lack of feedback. It is difficult or even impossible to apply existing grammar-based fuzzing strategies. Therefore, an efficient fuzzing approach with syntax inference is required in the IoT fuzzing domain. To address these critical problems, we propose a novel automatic black-box fuzzing approach for IoT firmware, termed Snipuzz. Snipuzz runs as a client communicating with the devices and infers message snippets for mutation based on the responses. Each snippet refers to a block of consecutive bytes that reflect the approximate code coverage in fuzzing. This mutation strategy based on message snippets considerably narrows down the search space to change the probing messages. We compared Snipuzz with four state-of-the-art IoT fuzzing approaches, i.e., IoTFuzzer, BooFuzz, Doona, and Nemesys. Snipuzz not only inherits the advantages of app-based fuzzing (e.g., IoTFuzzer), but also utilizes communication responses to perform efficient mutation. Furthermore, Snipuzz is lightweight as its execution does not rely on any prerequisite operations, such as reverse engineering of apps. We also evaluated Snipuzz on 20 popular real-world IoT devices. Our results show that Snipuzz could identify 5 zero-day vulnerabilities, and 3 of them could be exposed only by Snipuzz. All the newly discovered vulnerabilities have been confirmed by their vendors.

CR · Jul 20, 2020
Blockchain Meets COVID-19: A Framework for Contact Information Sharing and Risk Notification System

Jinyue Song, Tianbo Gu, Zheng Fang et al.

COVID-19 is one of the most severe global epidemics in human history. Even though there are particular medications and vaccines to curb the epidemic, tracing and isolating the infection source is the best option to slow the virus spread and reduce infection and death rates. Existing contact tracing systems have three disadvantages: (1) user data is stored in a centralized database that could be stolen or tampered with; (2) a user's confidential personal identity may be revealed to a third party or organization; (3) existing contact tracing systems only focus on information sharing from one dimension, such as location-based tracing, which significantly limits their effectiveness. We propose a global COVID-19 information sharing and risk notification system that utilizes the Blockchain, Smart Contract, and Bluetooth. To protect user privacy, we design a novel Blockchain-based platform that can share consistent and untampered contact tracing information from multiple dimensions, such as location-based for indirect contact and Bluetooth-based for direct contact. A hierarchical smart contract architecture is also designed to achieve global agreements from users about how to process and utilize user data, thereby enhancing the data usage transparency. Furthermore, we propose a mechanism to protect user identity privacy from multiple aspects. More importantly, our system can notify the users about the exposure risk via smart contracts. We implement a prototype system and conduct extensive measurements to demonstrate the feasibility and effectiveness of our system.

SE · May 4, 2019
A Feature-Oriented Corpus for Understanding, Evaluating and Improving Fuzz Testing

Xiaogang Zhu, Xiaotao Feng, Tengyun Jiao et al.

Fuzzing is a promising technique for detecting security vulnerabilities. Newly developed fuzzers are typically evaluated in terms of the number of bugs found on vulnerable programs/binaries. However, existing corpora usually do not capture the features that prevent fuzzers from finding bugs, leading to ambiguous conclusions on the pros and cons of the fuzzers evaluated. A typical example is that Driller detects more bugs than AFL, but its evaluation cannot establish whether the advancement of Driller stems from the concolic execution or not, since, for example, its ability to resolve a dataset's magic values is unclear. In this paper, we propose to address the above problem by generating corpora based on search-hampering features. As a proof-of-concept, we have designed FEData, a prototype corpus that currently focuses on four search-hampering features to generate vulnerable programs for fuzz testing. Unlike existing corpora that can only answer "how", FEData can also further answer "why" by exposing (or understanding) the reasons for the identified weaknesses in a fuzzer. The "why" information serves as the key to the improvement of fuzzers.

SE · May 2, 2019
Bug Searching in Smart Contract

Xiaotao Feng, Qin Wang, Xiaogang Zhu et al.

With the rapid development of smart contracts on the Ethereum platform, its market value has also climbed. In 2016, people were shocked by the loss of nearly $50 million in cryptocurrencies from the DAO reentrancy attack. Due to the tremendous amount of money flowing through smart contracts, their security has attracted much attention from researchers. In this paper, we investigate several common smart contract vulnerabilities and analyze their possible scenarios and how they may be exploited. Furthermore, we survey the smart contract vulnerability detection tools developed for the Ethereum platform in recent years. We find that these tools have similar prototypes in software vulnerability detection technology. Moreover, given the features of public distribution systems such as Ethereum, we present the new challenges that these software vulnerability detection technologies face.