CR · Apr 14

Neuro-symbolic Static Analysis with LLM-generated Vulnerability Patterns

Penghui Li, Songchen Yao, Josef Sarfati Korich, Changhua Luo, Jianjia Yu, Yinzhi Cao, Junfeng Yang

arXiv:2504.16057 · 7.85 citations · h-index: 6

Predicted impact top 50% in CR · last 90 days

Originality Incremental advance

AI Analysis

Automates the labor-intensive creation of vulnerability patterns for static analysis, reducing effort from weeks to hours while matching expert quality.

MoCQ uses LLMs to automatically generate vulnerability detection patterns for static analysis, achieving detection performance comparable to expert-crafted patterns in hours instead of weeks. Evaluated on 12 vulnerability types across 4 languages, it uncovered 46 vulnerability patterns that experts had missed and 25 previously unknown vulnerabilities in real-world applications.

In this work, we present MoCQ, a neuro-symbolic static analysis framework that leverages large language models (LLMs) to automatically generate vulnerability detection patterns. This approach combines the precision and scalability of pattern-based static analysis with the semantic understanding and automation capabilities of LLMs. MoCQ extracts the domain-specific languages for expressing vulnerability patterns and employs an iterative refinement loop with trace-driven symbolic validation that provides precise feedback for pattern correction. We evaluated MoCQ on 12 vulnerability types across four languages (C/C++, Java, PHP, JavaScript). MoCQ achieves detection performance comparable to expert-developed patterns while requiring only hours of generation versus weeks of manual effort. Notably, MoCQ uncovered 46 new vulnerability patterns that security experts had missed and discovered 25 previously unknown vulnerabilities in real-world applications. MoCQ also outperforms prior approaches with stronger analysis capabilities and broader applicability.
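The iterative refinement loop mentioned in the abstract can be illustrated with a minimal sketch. This is not MoCQ's implementation: a regular expression stands in for the static-analysis pattern language, a fixed list of candidates stands in for the LLM, and checking against a few labeled code snippets stands in for trace-driven symbolic validation. All names (`Sample`, `validate`, `refine`, `make_stub_generator`) are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Sample:
    code: str          # a line of source code
    vulnerable: bool   # ground-truth label for this line


def validate(pattern: str, samples: List[Sample]) -> List[str]:
    """Check a candidate pattern and return feedback messages (empty if it passes)."""
    try:
        rx = re.compile(pattern)
    except re.error as exc:
        # Syntax-level feedback: the pattern is not valid in the pattern language.
        return [f"pattern does not compile: {exc}"]
    feedback = []
    for s in samples:
        hit = rx.search(s.code) is not None
        if hit and not s.vulnerable:
            feedback.append(f"false positive on: {s.code}")
        elif not hit and s.vulnerable:
            feedback.append(f"missed vulnerable sample: {s.code}")
    return feedback


def refine(generate: Callable[[List[str]], str],
           samples: List[Sample],
           max_rounds: int = 5) -> Tuple[Optional[str], int]:
    """Generate, validate, and feed the validator's feedback into the next attempt."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        pattern = generate(feedback)
        feedback = validate(pattern, samples)
        if not feedback:
            return pattern, round_no
    return None, max_rounds


def make_stub_generator(candidates: List[str]) -> Callable[[List[str]], str]:
    """Stand-in for an LLM: returns the next canned candidate on each call."""
    remaining = iter(candidates)

    def generate(feedback: List[str]) -> str:
        # A real generator would place `feedback` into the model prompt;
        # this stub ignores it and simply moves to the next candidate.
        return next(remaining)

    return generate


samples = [
    Sample("strcpy(dst, src);", True),
    Sample("safe_strcpy(dst, src, n);", False),
    Sample("strncpy(dst, src, n);", False),
]
# First candidate has an unbalanced parenthesis, second over-matches, third passes.
candidates = [r"strcpy(", r"strcpy\(", r"\bstrcpy\("]
pattern, rounds = refine(make_stub_generator(candidates), samples)
print(pattern, rounds)
```

In this toy run the first candidate is rejected because it does not compile, the second because it flags `safe_strcpy`, and the third is accepted, which mirrors the idea of using precise validator feedback to correct a generated pattern.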

