Xinyue Zuo

h-index5
2papers

2 Papers

39.6SEMay 17
Event-B Agent: Towards LLM Agent for Formal Model Synthesis and Repair

Hongshu Wang, Xinyue Zuo, Yuhan Sun et al.

Building software that is correct by construction is a long-standing goal in software engineering, as it ensures reliability during design and development rather than after deployment. Formal methods realize this vision by enabling the expression of system behavior and requirements in mathematics, thereby guaranteeing correctness through formal verification, including theorem proving and model checking. However, the steep learning curve and demand for mathematical expertise hinder the widespread adoption of formal methods. Large language models (LLMs) have recently shown promise in bridging this gap through autoformalization. However, existing LLM-based approaches are largely limited to isolated tasks, such as theorem proving without formalization or model synthesis with insufficient verification. While valuable, these efforts do not fully exploit the potential of a more comprehensive framework in which models and proofs evolve together, a process that closely reflects real-world development practice. To address this gap, we propose Event-B Agent, a novel framework inspired by the interleaved nature of software design. Given natural language requirements, Event-B Agent constructs an initial model and iteratively repairs and refines it using formal verification feedback. Refinement simplifies proof discharge, while repair of models and proofs ensures the soundness of each refinement step. Together, these two components reinforce each other to progressively improve the model quality. Evaluation across systems of varying complexity demonstrates that Event-B Agent substantially outperforms baselines in end-to-end formal model synthesis and repair, while maintaining reasonable efficiency. These results suggest that Event-B Agent is a promising step toward correct-by-construction formal model synthesis and repair.

AIDec 9, 2024
The Fusion of Large Language Models and Formal Methods for Trustworthy AI Agents: A Roadmap

Yedi Zhang, Yufan Cai, Xinyue Zuo et al.

Large Language Models (LLMs) have emerged as a transformative AI paradigm, profoundly influencing daily life through their exceptional language understanding and contextual generation capabilities. Despite their remarkable performance, LLMs face a critical challenge: the propensity to produce unreliable outputs due to the inherent limitations of their learning-based nature. Formal methods (FMs), on the other hand, are a well-established computation paradigm that provides mathematically rigorous techniques for modeling, specifying, and verifying the correctness of systems. FMs have been extensively applied in mission-critical software engineering, embedded systems, and cybersecurity. However, the primary challenge impeding the deployment of FMs in real-world settings lies in their steep learning curves, the absence of user-friendly interfaces, and issues with efficiency and adaptability. This position paper outlines a roadmap for advancing the next generation of trustworthy AI systems by leveraging the mutual enhancement of LLMs and FMs. First, we illustrate how FMs, including reasoning and certification techniques, can help LLMs generate more reliable and formally certified outputs. Subsequently, we highlight how the advanced learning capabilities and adaptability of LLMs can significantly enhance the usability, efficiency, and scalability of existing FM tools. Finally, we show that unifying these two computation paradigms -- integrating the flexibility and intelligence of LLMs with the rigorous reasoning abilities of FMs -- has transformative potential for the development of trustworthy AI software systems. We acknowledge that this integration has the potential to enhance both the trustworthiness and efficiency of software engineering practices while fostering the development of intelligent FM tools capable of addressing complex yet real-world challenges.