Cristina David

SE
h-index6
10papers
153citations
Novelty53%
AI Score55

10 Papers

SYMay 6, 2017
Automated Formal Synthesis of Digital Controllers for State-Space Physical Plants

Alessandro Abate, Iury Bessa, Dario Cattaruzza et al.

We present a sound and automated approach to synthesize safe digital feedback controllers for physical plants represented as linear, time invariant models. Models are given as dynamical equations with inputs, evolving over a continuous state space and accounting for errors due to the digitalization of signals by the controller. Our approach has two stages, leveraging counterexample guided inductive synthesis (CEGIS) and reachability analysis. CEGIS synthesizes a static feedback controller that stabilizes the system under restrictions given by the safety of the reach space. Safety is verified either via BMC or abstract acceleration; if the verification step fails, we refine the controller by generalizing the counterexample. We synthesize stable and safe controllers for intricate physical plant models from the digital control literature.

SYFeb 16, 2017
Sound and Automated Synthesis of Digital Stabilizing Controllers for Continuous Plants

Alessandro Abate, Iury Bessa, Dario Cattaruzza et al.

Modern control is implemented with digital microcontrollers, embedded within a dynamical plant that represents physical components. We present a new algorithm based on counter-example guided inductive synthesis that automates the design of digital controllers that are correct by construction. The synthesis result is sound with respect to the complete range of approximations, including time discretization, quantization effects, and finite-precision arithmetic and its rounding errors. We have implemented our new algorithm in a tool called DSSynth, and are able to automatically generate stable controllers for a set of intricate plant models taken from the literature within minutes.

SEJul 28, 2022
Using Graph Neural Networks for Program Termination

Yoav Alon, Cristina David

Termination analyses investigate the termination behavior of programs, intending to detect nontermination, which is known to cause a variety of program bugs (e.g. hanging programs, denial-of-service vulnerabilities). Beyond formal approaches, various attempts have been made to estimate the termination behavior of programs using neural networks. However, the majority of these approaches continue to rely on formal methods to provide strong soundness guarantees and consequently suffer from similar limitations. In this paper, we move away from formal methods and embrace the stochastic nature of machine learning models. Instead of aiming for rigorous guarantees that can be interpreted by solvers, our objective is to provide an estimation of a program's termination behavior and of the likely reason for nontermination (when applicable) that a programmer can use for debugging purposes. Compared to previous approaches using neural networks for program termination, we also take advantage of the graph representation of programs by employing Graph Neural Networks. To further assist programmers in understanding and debugging nontermination bugs, we adapt the notions of attention and semantic segmentation, previously used for other application domains, to programs. Overall, we designed and implemented classifiers for program termination based on Graph Convolutional Networks and Graph Attention Networks, as well as a semantic segmentation Graph Neural Network that localizes AST nodes likely to cause nontermination. We also illustrated how the information provided by semantic segmentation can be combined with program slicing to further aid debugging.

SEMay 9Code
Using Semantic Distance to Estimate Uncertainty in LLM-Based Code Generation

Weilin He, Arindam Sharma, Cristina David

LLMs show strong performance in code generation, but their outputs lack correctness guarantees. Sample-based uncertainty estimators address this by generating multiple candidate programs and measuring their disagreement. However, existing estimators make different design choices about how behaviours are identified, aggregated, referenced and compared, making them difficult to assess. We therefore first introduce a taxonomy that disentangles these choices and reveals a missing design point: semantic distance-aware uncertainty estimation, which measures not only whether sampled programs disagree, but how severely their execution behaviours differ. Across LiveCodeBench, MBPP, HumanEval-X and BigCodeBench, spanning Python, Java and C++, our metrics provide strong proxies for correctness, and consistently outperform state-of-the-art sample-based baselines across both closed-source models (GPT-3.5-Turbo, GPT-4o-mini, Gemini-2.5-Flash-Lite, Claude Opus 4.5) and an open-source model (DeepSeek-Coder-V2). The method is practical: it requires neither model internals nor LLM-as-judge calls, remains robust across models, languages, sampling temperatures and fuzzing settings, and reduces runtime by approximately 48-79% relative to existing baselines.

DBAug 25, 2024Code
Enhancing SQL Query Generation with Neurosymbolic Reasoning

Henrijs Princis, Cristina David, Alan Mycroft

Neurosymbolic approaches blend the effectiveness of symbolic reasoning with the flexibility of neural networks. In this work, we propose a neurosymbolic architecture for generating SQL queries that builds and explores a solution tree using Best-First Search, with the possibility of backtracking. For this purpose, it integrates a Language Model (LM) with symbolic modules that help catch and correct errors made by the LM on SQL queries, as well as guiding the exploration of the solution tree. We focus on improving the performance of smaller open-source LMs, and we find that our tool, Xander, increases accuracy by an average of 10.9% and reduces runtime by an average of 28% compared to the LM without Xander, enabling a smaller LM (with Xander) to outperform its four-times larger counterpart (without Xander).

LGNov 27, 2025Code
TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation

Henrijs Princis, Arindam Sharma, Cristina David

Large language models (LLMs) have shown remarkable ability to generate code, yet their outputs often violate syntactic or semantic constraints when guided only through natural language prompts. We introduce TreeCoder, the most general and flexible framework to date for exploring decoding strategies, constraints, and hyperparameters in LLMs, and use it in code generation to enforce correctness and structure during decoding rather than relying on prompt engineering. TreeCoder represents decoding as a tree search over candidate programs, where both decoding strategies and constraint functions - such as style, syntax, execution - are treated as first-class, optimisable components. This design enables systematic exploration and automatic tuning of decoding configurations using standard optimisation techniques. Experiments on the MBPP (Python) and SQL-Spider benchmarks show that TreeCoder consistently improves accuracy across open-source models such as CodeLlama, Mistral and DeepSeek, often outperforming their unconstrained baselines by considerable margins.

PLApr 17
Literate Execution

Joe Bond, Jacob Pake, Cristina David et al.

\emph{Literate programming}, introduced by Knurth, interleaves code and prose so that a program can be read as both executable and explanatory text. We propose \emph{literate execution}, which inverts this relationship: rather than embedding code within a static narrative, we treat documentation -- and other expository elements such as visualisations -- as first-class artefacts that can be computed alongside a running program and then integrated into a view of its execution. We explore this idea through Fluid, a programming language with a provenance-tracking runtime that records fine-grained dependencies between inputs and outputs. These provenance relationships can be surfaced as interactions that allow readers to explore how intermediate values contribute to a result. By integrating visualisation, provenance, and exposition, literate execution aims to make programs more explorable and self-explanatory, and explorable explanations easier to program.

PLMar 25
Transformers for Program Termination

Yoav Alon, Cristina David

Determining whether a program terminates is a core challenge in program analysis with direct implications for correctness, verification, and security. We investigate whether transformer architectures can recognise termination patterns directly from source code and how their strengths can be amplified through ensembles. To overcome the extreme scarcity of non-terminating examples, we design an ensemble framework of compact transformer encoders, systematically trained with a suite of imbalance-aware loss functions and class-aware sampling techniques. By combining models trained with distinct loss functions, our ensembles achieve substantially stronger performance than any single transformer, outperforming both powerful off-the-shelf LLMs and graph-based methods. Finally, we introduce an attribution pipeline that produces syntax-aware explanations for the termination estimation.

LGOct 17, 2024
Integrating Large Language Models and Reinforcement Learning for Non-Linear Reasoning

Yoav Alon, Cristina David

Large Language Models (LLMs) were shown to struggle with long-term planning, which may be caused by the limited way in which they explore the space of possible solutions. We propose an architecture where a Reinforcement Learning (RL) Agent guides an LLM's space exploration: (1) the Agent has access to domain-specific information, and can therefore make decisions about the quality of candidate solutions based on specific and relevant metrics, which were not explicitly considered by the LLM's training objective; (2) the LLM can focus on generating immediate next steps, without the need for long-term planning. We allow non-linear reasoning by exploring alternative paths and backtracking. We evaluate this architecture on the program equivalence task, and compare it against Chain of Thought (CoT) and Tree of Thoughts (ToT). We assess both the downstream task, denoting the binary classification, and the intermediate reasoning steps. Our approach compares positively against CoT and ToT.

SEMay 18, 2015
Synthesising Interprocedural Bit-Precise Termination Proofs (extended version)

Hong-Yi Chen, Cristina David, Daniel Kroening et al.

Proving program termination is key to guaranteeing absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential, which is a largely unexplored area of research in termination analysis, where most effort has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision.