Can Large Language Models Autoformalize Kinematics?
This addresses the bottleneck of manually creating formal models for autonomous systems like robots and self-driving cars, representing an incremental step in applying LLMs to formal methods.
The paper tackles the problem of automating the formalization of physics models for cyber-physical systems by evaluating whether Large Language Models (LLMs) can convert natural language descriptions of kinematics problems into differential game logic (dGL) models, achieving a 70% success rate on a 20-problem benchmark.
Autonomous cyber-physical systems like robots and self-driving cars could greatly benefit from using formal methods to reason reliably about their control decisions. However, before a problem can be solved it needs to be stated. This requires writing a formal physics model of the cyber-physical system, which is a complex task that traditionally requires human expertise and becomes a bottleneck. This paper experimentally studies whether Large Language Models (LLMs) can automate the formalization process. A 20 problem benchmark suite is designed drawing from undergraduate level physics kinematics problems. In each problem, the LLM is provided with a natural language description of the objects' motion and must produce a model in differential game logic (dGL). The model is (1) syntax checked and iteratively refined based on parser feedback, and (2) semantically evaluated by checking whether symbolically executing the dGL formula recovers the solution to the original physics problem. A success rate of 70% (best over 5 samples) is achieved. We analyze failing cases, identifying directions for future improvement. This provides a first quantitative baseline for LLM-based autoformalization from natural language to a hybrid games logic with continuous dynamics.