HDLCoRe: A Training-Free Framework for Mitigating Hallucinations in LLM-Generated HDL
The work addresses unreliable HDL code generation, a practical problem for hardware designers, but the contribution is incremental: it combines existing techniques (CoT prompting and RAG) and introduces no new model architecture.
The paper tackles the hallucinations and incorrect code that large language models (LLMs) produce when applied to hardware description languages (HDL), a domain where training data is scarce. It proposes HDLCoRe, a training-free framework that combines prompt engineering with retrieval-augmented generation (RAG). On the RTLLM2.0 benchmark, the authors report superior performance, with fewer hallucinations and improved syntactic and functional correctness.
Recent advances in large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, when applied to hardware description languages (HDL), these models exhibit significant limitations due to data scarcity, resulting in hallucinations and incorrect code generation. To address these challenges, we propose HDLCoRe, a training-free framework that enhances LLMs' HDL generation capabilities through prompt engineering techniques and retrieval-augmented generation (RAG). Our approach consists of two main components: (1) an HDL-aware Chain-of-Thought (CoT) prompting technique with self-verification that classifies tasks by complexity and type, incorporates domain-specific knowledge, and guides LLMs through step-by-step self-simulation for error correction; and (2) a two-stage heterogeneous RAG system that addresses formatting inconsistencies through key component extraction and efficiently retrieves relevant HDL examples through sequential filtering and re-ranking. HDLCoRe eliminates the need for model fine-tuning while substantially improving LLMs' HDL generation capabilities. Experimental results demonstrate that our framework achieves superior performance on the RTLLM2.0 benchmark, significantly reducing hallucinations and improving both syntactic and functional correctness.
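The second component, key component extraction followed by sequential filtering and re-ranking, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the extraction rules or the scoring functions, so the regex-based extractor, the Jaccard filter, the bag-of-words cosine re-ranker, and all names below (`extract_key_components`, `retrieve`, the `keep` and `top_k` parameters, the toy corpus) are assumptions chosen only to show the overall shape of a two-stage retriever.

```python
import re
from collections import Counter
from math import sqrt


def extract_key_components(verilog_src):
    """Hypothetical extractor: module name and port names via simple regexes."""
    module = re.search(r"\bmodule\s+(\w+)", verilog_src)
    ports = re.findall(
        r"\b(?:input|output|inout)\b(?:\s+(?:wire|reg)\b)?(?:\s*\[[^\]]*\])?\s*(\w+)",
        verilog_src,
    )
    return {"module": module.group(1) if module else "", "ports": ports}


def tokens(text):
    """Lowercase alphanumeric tokens; a stand-in for a real tokenizer or embedder."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def jaccard(a, b):
    """Set overlap, used here as the cheap first-stage filter (an assumption)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Bag-of-words cosine similarity, used here as the re-ranking score (an assumption)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def retrieve(query, corpus, keep=2, top_k=1):
    """Stage 1 filters by description overlap; stage 2 re-ranks the survivors."""
    q = tokens(query)
    survivors = sorted(
        corpus, key=lambda ex: jaccard(q, tokens(ex["description"])), reverse=True
    )[:keep]

    def rerank_score(ex):
        # Score against the description plus the extracted module and port names,
        # so that differences in code formatting do not affect the ranking.
        comp = extract_key_components(ex["code"])
        text = " ".join([ex["description"], comp["module"], *comp["ports"]])
        return cosine(q, tokens(text))

    return sorted(survivors, key=rerank_score, reverse=True)[:top_k]


# Toy corpus, invented for illustration only.
CORPUS = [
    {"name": "counter", "description": "4-bit up counter with synchronous reset",
     "code": "module counter(input clk, input rst, output reg [3:0] count); endmodule"},
    {"name": "adder", "description": "8-bit ripple carry adder",
     "code": "module adder(input [7:0] a, input [7:0] b, output [8:0] sum); endmodule"},
    {"name": "fifo", "description": "synchronous FIFO buffer",
     "code": "module fifo(input clk, input wr_en, output full); endmodule"},
]

best = retrieve("up counter with reset", CORPUS)
print(best[0]["name"])
```

In a realistic setting the token-based scores would likely be replaced by embedding similarity or a learned re-ranker, and the retrieved examples would be inserted into the HDL-aware CoT prompt; the sketch only shows why a cheap filter can run first and a costlier scorer second.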