SEAI May 14, 2024

Automated Repair of AI Code with Large Language Models and Formal Verification

arXiv:2405.08848v1, 9 citations, h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses safety guarantees for AI systems by improving code reliability, though it is incremental as it builds on existing datasets and methods.

The paper tackles the problem of memory safety vulnerabilities in neural network code by automatically detecting them with formal verification and repairing them using large language models, achieving repairs on a dataset expanded to about 81k programs.

The next generation of AI systems requires strong safety guarantees. This report looks at the software implementation of neural networks and related memory safety properties, including NULL pointer dereference, out-of-bounds access, double-free, and memory leaks. Our goal is to detect these vulnerabilities and automatically repair them with the help of large language models. To this end, we first expand the size of NeuroCodeBench, an existing dataset of neural network code, to about 81k programs via an automated process of program mutation. Then, we verify the memory safety of the mutated neural network implementations with ESBMC, a state-of-the-art software verifier. Whenever ESBMC spots a vulnerability, we invoke a large language model to repair the source code. For the latter task, we compare the performance of various state-of-the-art prompt engineering techniques and an iterative approach that repeatedly calls the large language model.
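The verify-then-repair loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `verify` and `repair` callables, the `max_rounds` budget, and the idea of passing the verifier's report back to the repair step are all hypothetical choices made for this sketch. The toy stand-ins at the bottom replace the real verifier and language model so that the example runs on its own.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions for this sketch, not from the paper):
#   verify(source) returns None when the code is judged safe, otherwise a
#   textual report describing the violation.
#   repair(source, report) returns a candidate patched source.
Verifier = Callable[[str], Optional[str]]
Repairer = Callable[[str, str], str]


def iterative_repair(
    source: str,
    verify: Verifier,
    repair: Repairer,
    max_rounds: int = 3,
) -> Tuple[str, bool, int]:
    """Alternate verification and repair until the code verifies
    or the round budget is exhausted.

    Returns (final_source, verified, repair_calls).
    """
    calls = 0
    report = verify(source)
    while report is not None and calls < max_rounds:
        source = repair(source, report)  # one repair attempt per round
        calls += 1
        report = verify(source)          # re-check the candidate patch
    return source, report is None, calls


# Toy stand-ins so the sketch runs without a verifier or a model.
def toy_verify(src: str) -> Optional[str]:
    # Flags one hard-coded double-free pattern; a real verifier does far more.
    if "free(p); free(p);" in src:
        return "double free of p"
    return None


def toy_repair(src: str, report: str) -> str:
    # Pretends to be a model that removes the duplicated free() call.
    return src.replace("free(p); free(p);", "free(p);")


buggy = "int *p = malloc(4); free(p); free(p);"
fixed, ok, calls = iterative_repair(buggy, toy_verify, toy_repair)
```

In a real setup, `verify` would wrap a call to the software verifier and `repair` would wrap a prompt to the language model. Keeping both behind plain callables makes it straightforward to swap prompt strategies and compare a single-shot repair (`max_rounds=1`) against an iterative one.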

