45.2 · CR · May 29
A Core-Structure-Based Automated Analysis Tool for Commercial Virtualization Obfuscation Deobfuscation
Wanju Kim, Seoksu Lee, Eun-Sun Cho
Virtualization obfuscation is stronger than other obfuscation techniques, and its growing use in malware costs analysts significant time and effort. This study analyzes virtualization obfuscation and proposes VMPredator, a tool that automatically extracts semantic units. The tool combines several analyses, including memory analysis and trace analysis, while minimizing dependence on the internal structure of any specific virtual machine, so that it can handle diverse forms of virtualization obfuscation that existing tools cannot process. In experiments, the length of obfuscated programs was reduced by approximately 85%, and validation confirmed that small programs were fully restored to semantics identical to the original.
5.5 · CR · May 11
Towards LLM-Based Analysis of Virtualization-Obfuscated Code through Automated Data Generation
Sangjun An, Hyeyeon Park, Yejin Son et al.
Virtualization-based obfuscation produces extremely large and structurally complex binaries, posing challenges for LLM-based analysis due to input size limits and the need for large-scale labeled data. We address this by focusing on structural rather than full semantic analysis. Obfuscated binaries are decomposed into the largest semantically coherent units that fit within LLM constraints and are labeled according to their structural roles. We implement a static analysis framework to automate labeling and enable large-scale dataset generation. Our prototype shows strong performance on real-world virtualization obfuscators.
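The decomposition step described in this abstract, selecting the largest coherent units that still fit within a model's input limit, can be illustrated with a minimal sketch. This is not the authors' framework: the `Unit` tree, the `role` labels, the token sizes, and the function `largest_fitting_units` are all hypothetical names chosen for illustration, and the sketch assumes the binary has already been parsed into a hierarchy whose leaves carry a size estimate.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical node in a structural hierarchy of an obfuscated binary."""
    name: str
    role: str                      # assumed structural label, e.g. "handler"
    size: int = 0                  # token estimate; only leaves carry a size here
    children: list[Unit] = field(default_factory=list)

    def total(self) -> int:
        """Size of this unit including everything nested inside it."""
        return self.size + sum(child.total() for child in self.children)


def largest_fitting_units(unit: Unit, budget: int) -> tuple[list[Unit], list[Unit]]:
    """Return (fitting, oversized) units for a given token budget.

    A unit is kept whole if it fits; otherwise its children are tried.
    Leaves that exceed the budget cannot be split further in this sketch
    and are reported separately.
    """
    if unit.total() <= budget:
        return [unit], []
    if not unit.children:
        return [], [unit]
    fitting: list[Unit] = []
    oversized: list[Unit] = []
    for child in unit.children:
        ok, too_big = largest_fitting_units(child, budget)
        fitting.extend(ok)
        oversized.extend(too_big)
    return fitting, oversized


# Toy hierarchy with made-up sizes, for illustration only.
vm = Unit("vm", "interpreter", children=[
    Unit("dispatcher", "dispatcher", size=300),
    Unit("handlers", "handler_table", children=[
        Unit("h_add", "handler", size=400),
        Unit("h_xor", "handler", size=500),
    ]),
])

fitting, oversized = largest_fitting_units(vm, budget=600)
print([(u.name, u.role) for u in fitting])
```

With a budget of 600 the whole interpreter and the handler table are too large, so the sketch falls back to the dispatcher and the individual handlers; a larger budget would keep the handler table as a single labeled unit.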