57.4CRApr 7Code
Guiding Symbolic Execution with Static Analysis and LLMs for Vulnerability DiscoveryMd Shafiuzzaman, Achintya Desai, Wenbo Guo et al.
Symbolic execution detects vulnerabilities with precision, but applying it to large codebases requires harnesses that set up symbolic state, model dependencies, and specify assertions. Writing these harnesses has traditionally been a manual process requiring expert knowledge, which significantly limits the scalability of the technique. We present Static Analysis Informed and LLM-Orchestrated Symbolic Execution (SAILOR), which automates symbolic execution harness construction by combining static analysis with LLM-based synthesis. SAILOR operates in three phases: (1) static analysis identifies candidate vulnerable locations and generates vulnerability specifications; (2) an LLM uses vulnerability specifications and orchestrates harness synthesis by iteratively refining drivers, stubs, and assertions against compiler and symbolic execution feedback; symbolic execution then detects vulnerabilities using the generated harness, and (3) concrete replay validates the symbolic execution results against the unmodified project source. This design combines the scalability of static analysis, the code reasoning of LLMs, the path precision of symbolic execution, and the ground truth produced by concrete execution. We evaluate SAILOR on 10 open-source C/C++ projects totaling 6.8 M lines of code. SAILOR discovers 379 distinct, previously unknown memory-safety vulnerabilities (421 confirmed crashes). The strongest of five baselines we compare SAILOR to (agentic vulnerability detection using Claude Code with full codebase access and unlimited interaction), finds only 12 vulnerabilities. Each phase of SAILOR is critical: Without static analysis targeting confirmed vulnerabilities drop 12.2X; without iterative LLM synthesis zero vulnerabilities are confirmed; and without symbolic execution no approach can detect more than 12 vulnerabilities.
20.8SEMay 7
Assessing, Exploiting, and Mitigating Syntactic Robustness Failures in LLM-Based Code GenerationLaboni Sarker, Mara Downing, Achintya Desai et al.
Rapid advances in the field of Large Language Models (LLMs) have made LLM-based code generation an important area for investigation. An LLM-based code generator takes a prompt as input and produces code that implements the requirements specified in the prompt. Many software requirements include mathematical formulas that specify the expected behavior of the code to be generated. Given a code generation prompt that contains a mathematical formula, a reasonable expectation is that, if the formula is syntactically modified without changing its semantics, the generated code for the modified prompt should be semantically equivalent. We formalize this concept as syntactic robustness and investigate the syntactic robustness of LLMs as code generators. Our experimental assessment demonstrates that LLMs are not syntactically robust for code generation prompts with formulas, especially for the ones that require mathematical reasoning. We investigate attack strategies that can further deteriorate the syntactic robustness of LLMs. Finally, to mitigate syntactic robustness failures in LLMs, we propose a pre-processing step that uses reductions to transform formulas in prompts to a simplified form. Our experimental results demonstrate that the syntactic robustness of LLM-based code generation improves significantly using our approach, improving syntactic robustness of LLMs from 54.05% to 74.42%.
68.0SEApr 8
Program Analysis Guided LLM Agent for Proof-of-Concept GenerationAchintya Desai, Md Shafiuzzaman, Wenbo Guo et al.
Software developers frequently receive vulnerability reports that require them to reproduce the vulnerability in a reliable manner by generating a proof-of-concept (PoC) input that triggers it. Given the source code for a software project and a specific code location for a potential vulnerability, automatically generating a PoC for the given vulnerability has been a challenging research problem. Symbolic execution and fuzzing techniques require expert guidance and manual steps and face scalability challenges for PoC generation. Although recent advances in LLMs have increased the level of automation and scalability, the success rate of PoC generation with LLMs remains quite low. In this paper, we present a novel approach called Program Analysis Guided proof of concept generation agENT (PAGENT) that is scalable and significantly improves the success rate of automated PoC generation compared to prior results. PAGENT integrates lightweight and rule-based static analysis phases for providing static analysis guidance and sanitizer-based profiling and coverage information for providing dynamic analysis guidance with a PoC generation agent. Our experiments demonstrate that the resulting hybrid approach significantly outperforms the prior top-performing agentic approach by 132% for the PoC generation task.
CRJan 12, 2021
A Survey of Privacy-Preserving Techniques for Encrypted Traffic Inspection over Network MiddleboxesGeong Sen Poh, Dinil Mon Divakaran, Hoon Wei Lim et al.
Middleboxes in a computer network system inspect and analyse network traffic to detect malicious communications, monitor system performance and provide operational services. However, encrypted traffic hinders the ability of middleboxes to perform such services. A common practice in addressing this issue is by employing a "Man-in-the-Middle" (MitM) approach, wherein an encrypted traffic flow between two endpoints is interrupted, decrypted and analysed by the middleboxes. The MitM approach is straightforward and is used by many organisations, but there are both practical and privacy concerns. Due to the cost of the MitM appliances and the latency incurred in the encrypt-decrypt processes, enterprises continue to seek solutions that are less costly. There were discussion on the many efforts required to configure MitM. Besides, MitM violates end-to-end privacy guarantee, raising privacy concerns and issues on compliance especially with the rising awareness on user privacy. Furthermore, some of the MitM implementations were found to be flawed. Consequently, new practical and privacy-preserving techniques for inspection over encrypted traffic were proposed. We examine them to compare their advantages, limitations and challenges. We categorise them into four main categories by defining a framework that consist of system architectures, use cases, trust and threat models. These are searchable encryption, access control, machine learning and trusted hardware. We first discuss the man-in-the-middle approach as a baseline, then discuss in details each of them, and provide an in-depth comparisons of their advantages and limitations. By doing so we describe practical constraints, advantages and pitfalls towards adopting the techniques. We also give insights on the gaps between research work and industrial deployment, which leads us to the discussion on the challenges and research directions.