David Korczynski

CR
3papers
2citations
Novelty40%
AI Score38

3 Papers

87.5PLMay 26Code
Agentic Separation Logic Specification Synthesis

Tarun Suresh, David Korczynski, Julien Vanegue

Specification synthesis, the task of automatically inferring formal specifications from program implementations and natural language, is important for refactoring, transpilation, optimization, and verification, yet remains an open challenge for large C++ repositories. Existing LLM-based approaches fail to simultaneously scale to such repositories, produce specifications expressive enough to capture systems-code features such as dynamic memory and heap-allocated data structures, and systematically validate those specifications to rule out incorrect candidates. We present Spec-Agent, an agentic system for synthesizing expressive, well-validated specifications across large C++ codebases. Spec-Agent targets a ladder of specification languages: propositional logic, first-order logic, propositional separation logic, and first-order separation logic. For each function, Spec-Agent uses static analysis and runtime heap tracing to select the appropriate target specification language, generalizes existing functional tests into fuzz harnesses, and iteratively refines LLM-generated candidates via counterexample-guided feedback. We evaluate Spec-Agent on open source C++ codebases comprising millions of lines of code. Spec-Agent synthesizes valid specifications for 85% of target functions, with no false positives observed under fuzzing and expert validation, outperforming Claude Code Opus 4.6 at 10x lower token cost.

CRAug 27, 2019
A characterisation of system-wide propagation in the malware landscape

David Korczynski

System-wide propagation is frequently observed in malware, and there are several resources, like blog posts and similar, that detail some of the techniques used. However, there is currently no thorough study on the subject at large, and the full extent of system-wide malware propagation remains unknown. In this paper, we perform a systematic study on many real-world samples to comprehensively characterise system-wide propagation within the malware landscape and the goal is to use detailed and precise analyses to derive high-level views. We achieve this by collecting a diverse set of malware samples, analyse them in our Minerva malware analysis framework and then extract vast amounts of statistics about the results. We use these results to provide an in-depth discussion centred on four main research questions.

CRAug 24, 2019
Precise system-wide concatic malware unpacking

David Korczynski

Run time packing is a common approach malware use to obfuscate their payloads, and automatic unpacking is, therefore, highly relevant. The problem has received much attention, and so far, solutions based on dynamic analysis have been the most successful. Nevertheless, existing solutions lack in several areas, both conceptually and architecturally, because they focus on a limited part of the unpacking problem. These limitations significantly impact their applicability, and current unpackers have, therefore, experienced limited adoption. In this paper, we introduce a new tool, called Minerva, for effective automatic unpacking of malware samples. Minerva introduces a unified approach to precisely uncover execution waves in a packed malware sample and produce PE files that are well-suited for follow-up static analysis. At the core, Minerva deploys a novel information flow model of system-wide dynamically generated code, precise collection of API calls and a new approach for merging execution waves and API calls. Together, these novelties amplify the generality and precision of automatic unpacking and make the output of Minerva highly usable. We extensively evaluate Minerva against synthetic and real-world malware samples and show that our techniques significantly improve on several aspects compared to previous work.