Qinxiang Cao

PL
h-index16
6papers
29citations
Novelty50%
AI Score48

6 Papers

67.8SEMar 12
Neuro-Symbolic Generation and Validation of Memory-Aware Formal Function Specifications

Liao Zhang, Tong Chen, Xiwei Wu et al.

Formal verification of memory-manipulating programs critically depends on precise function specifications that capture memory states written by experts. This requirement has become a major bottleneck as large language models (LLMs) increasingly generate low-level systems code whose correctness cannot be assumed. To enable scalable formal verification, we focus exclusively on function specification generation, deliberately avoiding the synthesis of complex loop invariants that are central to traditional verification pipelines. We propose a neuro-symbolic framework for automatically generating memory-aware formal function specifications for C programs from natural language problem descriptions and function signatures. The pipeline first produces candidate specifications via in-context learning, and then iteratively refines them using compiler diagnostics from symbolic provers and the verification toolchain. In particular, we validate candidate specifications by constructing a proof for the negation of the specification with concrete examples, enabling machine-checked rejection of plausible-but-incorrect specifications. To support systematic evaluation, we introduce LeetCode-C-Spec, a new benchmark of 200 C programming problems for generating memory-aware formal function specifications. Experiments show that iterative refinement substantially improves syntactic validity, while symbolic prover-based refutation significantly enhances correctness assessment by filtering false positives that LLM-only judges frequently accept. Our results demonstrate that combining neural generation with symbolic feedback provides an effective approach to formal specification synthesis for memory-safe systems software.

79.2PLApr 24
QCP: A Practical Separation Logic-based C Program Verification Tool

Xiwei Wu, Yueyang Feng, Xiaoyang Lu et al.

As software systems increase in size and complexity dramatically, ensuring their correctness, security, and reliability becomes an increasingly formidable challenge. Despite significant advancements in verification techniques and tools, their practical application to complex, real-world systems is often hindered by critical gaps in both automation and expressiveness. To address these difficulties, this paper presents \textbf{Qualified C Programming Verifier (QCP)}, a novel verification tool that integrates annotation-based automatic verification with interactive proving using Rocq. QCP employs symbolic execution and a separation logic entailment solver to automatically discharge many verification obligations, while deferring more complex obligations to Rocq for manual proof. Furthermore, QCP includes a VS Code extension designed to enhance proof efficiency and support a deeper understanding of both the program behavior and verification outcomes.

88.6PLMar 29
Denotation-based Compositional Compiler Verification

Zhang Cheng, Jiyang Wu, Di Wang et al.

A desired but challenging property of compiler verification is compositionality, in the sense that the compilation correctness of a program can be deduced incrementally from that of its substructures ranging from statements, functions, and modules. This article proposes a novel compiler verification framework based on denotational semantics for better compositionality, compared to previous approaches based on small-step operational semantics and simulation theories. Our denotational semantics is defined by semantic functions that map a syntactic component to a semantic domain composed of multiple behavioral \emph{sets}, with compiler correctness established through behavior refinement between the semantic domains of the source and target programs. The main contributions of this article include proposing a denotational semantics for open modules, a novel semantic linking operator, and a refinement algebra that unifies various behavior refinements, making compiler verification structured and compositional. Furthermore, our formalization captures the full meaning of a program and bridges the gap between traditional power-domain-based denotational semantics and the practical needs of compiler verification. We apply our denotation-based framework to verify the front-end of CompCert and typical optimizations on simple prototypes of imperative languages. Our results demonstrate that the compositionality from sub-statements to statements, from functions to modules, and from modules to the whole program can be effectively achieved.

AIMay 7, 2025
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving

Qi Liu, Xinhao Zheng, Renqiu Xia et al.

As a seemingly self-explanatory task, problem-solving has been a significant component of science and engineering. However, a general yet concrete formulation of problem-solving itself is missing. With the recent development of AI-based problem-solving agents, the demand for process-level verifiability is rapidly increasing yet underexplored. To fill these gaps, we present a principled formulation of problem-solving as a deterministic Markov decision process; a novel framework, FPS (Formal Problem-Solving), which utilizes existing FTP (formal theorem proving) environments to perform process-verified problem-solving; and D-FPS (Deductive FPS), decoupling solving and answer verification for better human-alignment. The expressiveness, soundness and completeness of the frameworks are proven. We construct three benchmarks on problem-solving: FormalMath500, a formalization of a subset of the MATH500 benchmark; MiniF2F-Solving and PutnamBench-Solving, adaptations of FTP benchmarks MiniF2F and PutnamBench. For faithful, interpretable, and human-aligned evaluation, we propose RPE (Restricted Propositional Equivalence), a symbolic approach to determine the correctness of answers by formal verification. We evaluate four prevalent FTP models and two prompting methods as baselines, solving at most 23.77% of FormalMath500, 27.47% of MiniF2F-Solving, and 0.31% of PutnamBench-Solving.

PLOct 11, 2025
Learning to Guarantee Type Correctness in Code Generation through Type-Guided Program Synthesis

Zhechong Huang, Zhao Zhang, Ruyi Ji et al.

Language models have shown remarkable proficiency in code generation; nevertheless, ensuring type correctness remains a challenge. Although traditional methods, such as constrained decoding, alleviate this problem by externally rejecting untypable code, the model itself does not effectively learn type reasoning internally, which ultimately limits its overall performance. This paper introduces TyFlow, a novel system that internalizes type reasoning within code generation to guide the model to learn the type system. The core of our approach is a novel type-guided program synthesis system that maintains an isomorphism between type derivation trees and synthesis derivation trees, enabling a new code representation based on synthesis decision sequences rather than traditional text-based token sequences. By offloading the complexity of type system learning to the representation itself, models can redirect their computational resources toward higher-level program semantics. Our evaluation shows that TyFlow not only eliminates type errors but also significantly improves functional correctness, highlighting the importance of aligning LMs with type systems internally.

LGFeb 25, 2022
Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection

Ting Long, Yutong Xie, Xianyu Chen et al.

Program representation, which aims at converting program source code into vectors with automatically extracted features, is a fundamental problem in programming language processing (PLP). Recent work tries to represent programs with neural networks based on source code structures. However, such methods often focus on the syntax and consider only one single perspective of programs, limiting the representation power of models. This paper proposes a multi-view graph (MVG) program representation method. MVG pays more attention to code semantics and simultaneously includes both data flow and control flow as multiple views. These views are then combined and processed by a graph neural network (GNN) to obtain a comprehensive program representation that covers various aspects. We thoroughly evaluate our proposed MVG approach in the context of algorithm detection, an important and challenging subfield of PLP. Specifically, we use a public dataset POJ-104 and also construct a new challenging dataset ALG-109 to test our method. In experiments, MVG outperforms previous methods significantly, demonstrating our model's strong capability of representing source code.