SE | AI | CR | OS | Mar 12, 2025

KNighter: Transforming Static Analysis with LLM-Synthesized Checkers

arXiv:2503.09002v3 | 29 citations | h-index: 15 | SOSP
Originality: Highly original
AI Analysis

This paper addresses the problem of scalable and reliable bug detection in critical systems such as operating system kernels, and presents its approach as a new paradigm rather than an incremental improvement.

The paper tackles the challenge of designing static analyzers for bug detection by introducing KNighter, which uses LLMs to automatically synthesize static analyzers from historical bug patterns. The synthesized checkers have found 92 new critical bugs in the Linux kernel, of which 77 are confirmed and 57 fixed.

Static analysis is a powerful technique for bug detection in critical systems like operating system kernels. However, designing and implementing static analyzers is challenging, time-consuming, and typically limited to predefined bug patterns. While large language models (LLMs) have shown promise for static analysis, directly applying them to scan large systems remains impractical due to computational constraints and contextual limitations. We present KNighter, the first approach that unlocks scalable LLM-based static analysis by automatically synthesizing static analyzers from historical bug patterns. Rather than using LLMs to directly analyze massive systems, our key insight is leveraging LLMs to generate specialized static analyzers guided by historical patch knowledge. KNighter implements this vision through a multi-stage synthesis pipeline that validates checker correctness against original patches and employs an automated refinement process to iteratively reduce false positives. Our evaluation on the Linux kernel demonstrates that KNighter generates high-precision checkers capable of detecting diverse bug patterns overlooked by existing human-written analyzers. To date, KNighter-synthesized checkers have discovered 92 new, critical, long-latent bugs (average 4.3 years) in the Linux kernel; 77 are confirmed, 57 fixed, and 30 have been assigned CVE numbers. This work establishes an entirely new paradigm for scalable, reliable, and traceable LLM-based static analysis for real-world systems via checker synthesis.
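The validation and refinement steps described in the abstract can be illustrated with a small sketch. This is not KNighter's implementation: the `Patch` type, the string-matching "checkers", and the `synthesize` loop are all hypothetical stand-ins, and the fixed list of candidates merely plays the role of successive LLM attempts. It only shows the acceptance logic: a candidate checker must flag the pre-patch code, stay silent on the patched code, and then stay within a false-positive budget on code assumed to be correct.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical model: a checker takes source text and returns True if it reports a bug.
Checker = Callable[[str], bool]


@dataclass
class Patch:
    buggy: str  # code before the historical fix
    fixed: str  # code after the historical fix


def is_valid(checker: Checker, patch: Patch) -> bool:
    """A checker is valid if it flags the buggy version but not the fixed one."""
    return checker(patch.buggy) and not checker(patch.fixed)


def false_positives(checker: Checker, known_clean: List[str]) -> List[str]:
    """Snippets assumed to be correct that the checker still reports."""
    return [snippet for snippet in known_clean if checker(snippet)]


def synthesize(patch: Patch, candidates: List[Checker],
               known_clean: List[str], max_fp: int = 0) -> Optional[Checker]:
    """Return the first candidate that validates against the patch and
    stays within the false-positive budget, or None if none qualifies."""
    for checker in candidates:
        if not is_valid(checker, patch):
            continue  # rejected: does not separate buggy from fixed code
        if len(false_positives(checker, known_clean)) <= max_fp:
            return checker
    return None


# Toy candidates for a "missing NULL check after allocation" pattern.
def too_broad(code: str) -> bool:
    return "alloc(" in code


def naive(code: str) -> bool:
    return "alloc(" in code and "if (!" not in code


def refined(code: str) -> bool:
    return "alloc(" in code and "if (!" not in code and "== NULL" not in code


patch = Patch(
    buggy="p = alloc(n); p->len = n;",
    fixed="p = alloc(n); if (!p) return -1; p->len = n;",
)
clean = ["q = alloc(n); if (q == NULL) return -1; q->len = n;"]

chosen = synthesize(patch, [too_broad, naive, refined], clean)
```

In this toy run, `too_broad` fails validation because it also flags the fixed code, `naive` passes validation but reports the clean snippet, and `refined` is the first candidate accepted. A real checker would operate on program structure rather than substrings; the string matching here is purely for brevity.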

Code Implementations: 1 repo