AR · AI · LG · Jul 23, 2024

OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection

arXiv:2407.16237v2 · 82 citations · h-index: 10 · Has Code
AI Analysis

This work addresses privacy and security concerns in RTL code generation for hardware design by providing a competitive open-source alternative to proprietary LLMs, though it is incremental in that it builds on existing LLM methods.

The paper tackles the underperformance of open-source LLMs in RTL code generation, which it attributes to a scarcity of high-quality datasets. It introduces OriGen, a framework that combines code-to-code augmentation with self-reflection, and reports a 12.8% improvement over the previous best open-source LLM along with a higher pass@1 than GPT-4 Turbo on VerilogEval-Human.
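For readers unfamiliar with the pass@1 metric cited above, the following is a minimal sketch of the unbiased pass@k estimator commonly used for code-generation benchmarks. It assumes the benchmark scores each problem from `n` sampled completions of which `c` pass the tests; the function names are illustrative and not taken from the paper or its repository.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for one problem.

    n: completions sampled for the problem
    c: completions among them that pass all tests
    k: budget of attempts being evaluated (k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Probability that a random size-k subset contains no passing sample,
    # subtracted from 1. comb() returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over a benchmark given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With `k = 1` the estimator reduces to `c / n`, the fraction of sampled completions that pass.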

Recent studies have demonstrated the significant potential of Large Language Models (LLMs) in generating Register Transfer Level (RTL) code, with notable advancements showcased by commercial models such as GPT-4 and Claude3-Opus. However, these proprietary LLMs often raise concerns regarding privacy and security. While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. Our approach employs a code-to-code augmentation technique to enhance the quality of open-source RTL code datasets. Furthermore, OriGen can rectify syntactic errors through a self-reflection process that leverages compiler feedback. Experimental results demonstrate that OriGen significantly outperforms other open-source alternatives in RTL code generation. It surpasses the previous best-performing open-source LLM by 12.8% and even exceeds GPT-4 Turbo in the pass@1 metric on the VerilogEval-Human benchmark. Moreover, OriGen exhibits superior capabilities in self-reflection and error correction, outperforming GPT-4 by 19.9% on a benchmark designed to evaluate self-reflection capabilities.
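The self-reflection process described in the abstract can be pictured as a generate, compile, repair loop. The sketch below is an illustration under stated assumptions, not OriGen's actual implementation: `generate` and `compile_check` are hypothetical callables standing in for an LLM and a Verilog compiler, the repair prompt wording is invented, and the toy stubs exist only to make the example runnable.

```python
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not part of OriGen):
#   generate(prompt) -> RTL source text
#   compile_check(code) -> (compiled_ok, compiler_log)
Generator = Callable[[str], str]
Compiler = Callable[[str], Tuple[bool, str]]


def generate_with_reflection(
    spec: str,
    generate: Generator,
    compile_check: Compiler,
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Generate RTL, then retry with compiler feedback up to max_rounds times."""
    code = generate(spec)
    for _ in range(max_rounds):
        ok, log = compile_check(code)
        if ok:
            return code, True
        # Feed the failed attempt and the compiler log back to the model.
        repair_prompt = (
            f"{spec}\n\nPrevious attempt:\n{code}\n\n"
            f"Compiler errors:\n{log}\n\n"
            "Fix the syntax errors and return the full module."
        )
        code = generate(repair_prompt)
    ok, _ = compile_check(code)
    return code, ok


# Toy stand-ins so the loop can be exercised without a model or compiler.
def toy_generate(prompt: str) -> str:
    body = "module and2(input a, input b, output y);\n  assign y = a & b;\n"
    # The first attempt omits 'endmodule'; the repair attempt includes it.
    return body + "endmodule\n" if "Compiler errors" in prompt else body


def toy_compile(code: str) -> Tuple[bool, str]:
    if "endmodule" in code:
        return True, ""
    return False, "syntax error: missing 'endmodule'"
```

Calling `generate_with_reflection("2-input AND gate", toy_generate, toy_compile)` fails the first check, sends the error text back, and succeeds on the repaired attempt. In practice `compile_check` would wrap a real Verilog compiler and `generate` a fine-tuned model.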

Code Implementations: 1 repo