PL · AR · Mar 12

AutoVeriFix+: High-Correctness RTL Generation via Trace-Aware Causal Fix and Semantic Redundancy Pruning

Yan Tan, Xiangchen Meng, Zijun Jiang, Yangdi Lyu

arXiv:2603.11489v1 · 9.2 · h-index: 14

Predicted impact: top 13% in PL · last 90 days

Originality: Highly original

AI Analysis

This work addresses the challenge of generating hardware description language code for circuit designers, offering an approach aimed at improving both the functional correctness and the area efficiency of generated designs.

The paper tackles the generation of functionally correct Verilog code with large language models. It proposes AutoVeriFix+, a three-stage framework that combines high-level semantic reasoning with state-space exploration. The reported results are over 80% functional correctness, a pass@10 score of 90.2% on the VerilogEval-machine dataset, and an average 25% reduction in redundant logic across benchmarks.
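For readers unfamiliar with the pass@10 figure, the sketch below shows the unbiased pass@k estimator that is commonly used for code-generation benchmarks. The listing does not state which estimator the authors used, so treating this as their metric is an assumption; the function name `pass_at_k` and its parameters are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of candidates generated for the problem
    c: number of those candidates that pass the functional tests
    k: sample budget being evaluated (k <= n)

    Returns the probability that at least one of k candidates, drawn
    without replacement from the n generated, is correct.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect candidates exist, so any draw of k
        # must contain at least one correct candidate.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical numbers: 20 candidates generated, 5 of them correct.
single_sample_rate = pass_at_k(n=20, c=5, k=1)
ten_sample_rate = pass_at_k(n=20, c=5, k=10)
```

A benchmark-level score would then be the mean of this value over all problems.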

Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as Python and C++. However, applying them to hardware description languages such as Verilog is challenging because high-quality training data is scarce. Current approaches to Verilog code generation with LLMs often focus on syntactic correctness, which yields code that compiles but contains functional errors. To address these challenges, we propose AutoVeriFix+, a novel three-stage framework that integrates high-level semantic reasoning with state-space exploration to enhance functional correctness and design efficiency. In the first stage, an LLM generates high-level Python reference models that define the intended circuit behavior. In the second stage, another LLM generates initial Verilog RTL candidates and iteratively fixes syntactic errors. In the third stage, we introduce a concolic testing engine to exercise deep sequential logic and identify corner-case vulnerabilities. With cycle-accurate execution traces and internal register snapshots, AutoVeriFix+ provides the LLM with the causal context necessary to resolve complex state-transition errors. Furthermore, it generates a coverage report that identifies functionally redundant branches, enabling the LLM to perform semantic pruning for area optimization. Experimental results demonstrate that AutoVeriFix+ achieves over 80% functional correctness on rigorous benchmarks, reaching a pass@10 score of 90.2% on the VerilogEval-machine dataset. In addition, it eliminates an average of 25% redundant logic across benchmarks through trace-aware optimization.
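The idea of checking an RTL candidate against a Python reference model and reporting the first divergent cycle with a register snapshot can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4-bit counter, the trace format (one dict per cycle with `out` and `regs` keys), and the names `CounterModel`, `first_divergence`, and `format_feedback` are all hypothetical, and the DUT trace is hard-coded rather than produced by a simulator or a concolic engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Divergence:
    cycle: int        # first cycle where the DUT disagrees with the model
    inputs: dict      # stimulus applied in that cycle
    expected: int     # output of the reference model
    actual: int       # output observed from the DUT
    registers: dict   # DUT internal register snapshot for that cycle


class CounterModel:
    """Hypothetical reference model: 4-bit up-counter, sync reset, enable."""

    def __init__(self) -> None:
        self.count = 0

    def step(self, rst: int, en: int) -> int:
        if rst:
            self.count = 0
        elif en:
            self.count = (self.count + 1) & 0xF
        return self.count


def first_divergence(model, stimuli, dut_trace) -> Optional[Divergence]:
    """Replay stimuli on the model and return the first mismatch, if any."""
    for cycle, (inputs, dut) in enumerate(zip(stimuli, dut_trace)):
        expected = model.step(**inputs)
        if expected != dut["out"]:
            return Divergence(cycle, inputs, expected, dut["out"], dut["regs"])
    return None


def format_feedback(d: Divergence) -> str:
    """Render a mismatch as text that could be placed in a repair prompt."""
    return (
        f"Mismatch at cycle {d.cycle}: inputs={d.inputs}, "
        f"expected out={d.expected}, got out={d.actual}, "
        f"registers={d.registers}"
    )


# Assumed stimulus: reset, count twice, hold, count again.
stimuli = [
    {"rst": 1, "en": 0},
    {"rst": 0, "en": 1},
    {"rst": 0, "en": 1},
    {"rst": 0, "en": 0},
    {"rst": 0, "en": 1},
]

# Assumed trace from a buggy candidate that ignores the enable input.
buggy_trace = [
    {"out": 0, "regs": {"count": 0}},
    {"out": 1, "regs": {"count": 1}},
    {"out": 2, "regs": {"count": 2}},
    {"out": 3, "regs": {"count": 3}},
    {"out": 4, "regs": {"count": 4}},
]

report = first_divergence(CounterModel(), stimuli, buggy_trace)
feedback = format_feedback(report) if report else "no mismatch"
```

Here the first mismatch falls in the cycle where `en` is 0, which points at the enable handling rather than at the counter arithmetic. That is the kind of cycle-level context the abstract describes passing back to the LLM.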
