CRMay 20

A Large Language Model Approach to Generating Bypass Rules for Malware Evasion in Analysis Sandbox

Zhiyong Sui, Lamine Noureddine, Mst Eshita Khatun, Sideeq Bello, Justin Woodring, Aisha Ali-Gombe

arXiv:2605.2182167.8

Predicted impact top 21% in CR · last 90 daysOriginality Incremental advance

AI Analysis

For malware analysts, ABLE automates the generation of bypass rules, addressing the scalability challenge of manually reverse-engineering evasion techniques.

ABLE uses LLMs to automatically generate YARA rules for bypassing sandbox evasion checks, achieving a 79% bypass success rate on 334 real-world malware samples and identifying 47% more malware family classifications than existing platforms.

Sandbox evasion remains a critical challenge for automated malware analysis, as modern malware employs environment checks to detect analysis platforms and suppress malicious behavior. Existing approaches rely on manually crafted bypass rules that require deep reverse engineering of each evasion mechanism -an approach that cannot scale against rapidly evolving evasion techniques. In this paper, we leverage large language models (LLMs) to automatically generate YARA rules that bypass evasion checks in sandbox environments. We propose ABLE, which analyzes execution traces from malware terminated due to potentially evasive behavior and employs multiple reasoning strategies to generate targeted bypass rules. To address syntactic errors and improve the efficacy of the bypass rules in the LLM outputs, we introduce an auto-sanitization pipeline and feedback-driven iterative refinement. We evaluate ABLE on 334 real-world malware samples across four open-weight LLMs. ABLE achieves a 79% bypass success rate, with iterative refinement contributing 29.5% of successful cases. Compared to existing analysis platforms, ABLE identifies 47% more malware family classifications and exposes previously hidden behaviors.

View on arXiv PDF

Similar