AR CL Apr 11

VeriInteresting: An Empirical Study of Model Prompt Interactions in Verilog Code Generation

arXiv:2603.08715 77.71 citations h-index: 13
AI Analysis

For hardware engineers using LMs for Verilog code generation, this study provides practical guidance on prompt engineering, though findings are incremental and domain-specific.

This paper empirically maps interactions between model characteristics and prompt design strategies for Verilog code generation. It finds that structured prompts and prompt optimization improve performance across diverse LMs, with some trends generalizing across models and benchmarks and others remaining model-specific.

Rapid advances in language models (LMs) have created new opportunities for automated code generation while complicating trade-offs between model characteristics and prompt design choices. In this work, we provide an empirical map of recent trends in LMs for Verilog code generation, focusing on interactions among model reasoning, specialization, and prompt engineering strategies. We evaluate a diverse set of small and large LMs, including general-purpose, reasoning, and domain-specific variants. Our experiments use a controlled factorial design spanning benchmark prompts, structured outputs, prompt rewriting, chain-of-thought reasoning, in-context learning, and evolutionary prompt optimization via Genetic-Pareto. Across two Verilog benchmarks, we identify patterns in how model classes respond to structured prompts and optimization, and we document which trends generalize across LMs and benchmarks versus those that are specific to particular model-prompt combinations.
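The controlled factorial design described in the abstract can be pictured as a grid of experimental conditions. The following is only a minimal sketch under stated assumptions: the factor levels are taken from the categories named in the abstract, while the benchmark names, the `factorial_grid` helper, and the choice to cross every level with every other (a full factorial) are hypothetical and not confirmed by the source.

```python
from itertools import product

# Factor levels named after the categories in the abstract.
# The exact levels used in the study are an assumption here.
MODEL_CLASSES = ["general-purpose", "reasoning", "domain-specific"]
MODEL_SIZES = ["small", "large"]
PROMPT_STRATEGIES = [
    "benchmark prompt",
    "structured output",
    "prompt rewriting",
    "chain-of-thought",
    "in-context learning",
    "evolutionary prompt optimization",
]
# Placeholder names: the abstract mentions two Verilog benchmarks
# but does not name them.
BENCHMARKS = ["benchmark_a", "benchmark_b"]


def factorial_grid():
    """Return one dict per condition in a full factorial crossing (hypothetical helper)."""
    return [
        {
            "model_class": model_class,
            "model_size": model_size,
            "prompt_strategy": strategy,
            "benchmark": benchmark,
        }
        for model_class, model_size, strategy, benchmark in product(
            MODEL_CLASSES, MODEL_SIZES, PROMPT_STRATEGIES, BENCHMARKS
        )
    ]


grid = factorial_grid()
print(len(grid), "conditions")
print(grid[0])
```

Each condition would then be run and scored separately, which is what makes it possible to distinguish trends that hold across the whole grid from those tied to one model-prompt combination.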
