LG · AI · PL · Nov 19, 2025

TB or Not TB: Coverage-Driven Direct Preference Optimization for Verilog Stimulus Generation

arXiv:2511.15767v1 · 2 citations · h-index: 5 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the labor-intensive task of stimulus generation in hardware verification and offers a domain-specific, incremental improvement.

The paper tackles the generation of effective stimuli for hardware design verification, a time-consuming phase of the design flow. It introduces TB or not TB, a framework that fine-tunes LLMs with Coverage-Driven Direct Preference Optimization (CD-DPO), and reports up to 77.27% improvement in code coverage on the CVDP CID12 benchmark.

With the rapid advancement of Large Language Models (LLMs), there is growing interest in applying them to hardware design and verification. Among these stages, design verification remains the most time-consuming and resource-intensive phase, where generating effective stimuli for the design under test (DUT) is both critical and labor-intensive. We present TB or not TB, a framework for automated stimulus generation using LLMs fine-tuned through Coverage-Driven Direct Preference Optimization (CD-DPO). To enable preference-based training, we introduce PairaNet, a dataset derived from PyraNet that pairs high- and low-quality testbenches labeled using simulation-derived coverage metrics. The proposed CD-DPO method integrates quantitative coverage feedback directly into the optimization objective, guiding the model toward generating stimuli that maximize verification coverage. Experiments on the CVDP CID12 benchmark show that TB or not TB outperforms both open-source and commercial baselines, achieving up to 77.27% improvement in code coverage and demonstrating the effectiveness of coverage-driven preference optimization for LLM-based hardware verification.
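
The abstract does not spell out how the pairs are built or how coverage enters the objective, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It pairs testbenches for the same DUT by simulated coverage (higher coverage becomes the "chosen" response) and evaluates the standard DPO loss for one pair. The `Testbench` fields, the `min_gap` threshold, and the coverage-scaled `margin` term are all hypothetical; the margin is one plausible way to feed a coverage gap into the loss and may differ from the actual CD-DPO formulation.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Testbench:
    design_id: str   # hypothetical identifier of the DUT
    code: str        # testbench / stimulus source text
    coverage: float  # simulation-derived code coverage in [0, 1]


def build_preference_pairs(testbenches, min_gap=0.05):
    """Return (chosen, rejected) pairs; chosen has the higher coverage.

    Only testbenches for the same design are compared, and pairs whose
    coverage differs by less than `min_gap` (an assumed threshold) are
    skipped as near-ties.
    """
    by_design = {}
    for tb in testbenches:
        by_design.setdefault(tb.design_id, []).append(tb)

    pairs = []
    for group in by_design.values():
        for a, b in combinations(group, 2):
            if abs(a.coverage - b.coverage) < min_gap:
                continue
            pairs.append((a, b) if a.coverage > b.coverage else (b, a))
    return pairs


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected,
             beta=0.1, margin=0.0):
    """Standard DPO loss for one pair, with an optional margin.

    With margin=0 this is -log(sigmoid(beta * (policy log-ratio gap))).
    A positive margin demands a larger gap before the loss becomes small.
    """
    gap = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return softplus(-(beta * gap - margin))


def coverage_margin(chosen, rejected, gamma=1.0):
    """Hypothetical margin proportional to the coverage gap of a pair."""
    return gamma * (chosen.coverage - rejected.coverage)
```

In this sketch, a pair with a large coverage gap receives a larger margin and therefore a stronger push toward the higher-coverage testbench, while setting `gamma=0` recovers plain DPO.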
