LG, AR | Jul 14, 2025

Iceberg: Enhancing HLS Modeling with Synthetic Data

arXiv:2507.09948v2 | h-index: 9 | Has Code | 2025 IEEE International Conference on LLM-Aided Design (ICLAD)
Originality: Incremental advance
AI Analysis

This paper addresses the generalizability gap in HLS modeling for hardware design, offering a domain-specific, incremental improvement.

The paper tackles the generalization problem in deep learning models for High-Level Synthesis (HLS) by introducing Iceberg, a synthetic data augmentation approach that uses LLM-generated programs and weak labels. It reports an 86.4% improvement in geometric mean modeling accuracy when adapted to six real-world applications with few-shot examples, and 2.47x and 1.12x better offline design space exploration (DSE) performance when adapting to two different test datasets.

Deep learning-based prediction models for High-Level Synthesis (HLS) of hardware designs often struggle to generalize. In this paper, we study how to close the generalizability gap of these models through pretraining on synthetic data and introduce Iceberg, a synthetic data augmentation approach that expands both large language model (LLM)-generated programs and weak labels of unseen design configurations. Our weak label generation method is integrated with an in-context model architecture, enabling meta-learning from actual and proximate labels. Iceberg improves the geometric mean modeling accuracy by $86.4\%$ when adapted to six real-world applications with few-shot examples and achieves $2.47\times$ and $1.12\times$ better offline DSE performance when adapting to two different test datasets. Our open-sourced code is available at https://github.com/UCLA-VAST/iceberg
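
The abstract does not spell out how the geometric mean improvement is computed. As a minimal sketch of one common convention, assuming a lower-is-better error is measured per application, the improvement can be taken as the geometric mean of per-application error ratios. The function names and the numbers below are hypothetical and are not taken from the paper or its repository.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geomean_improvement(baseline_errors, new_errors):
    """Percent improvement from the geometric mean of per-application error ratios.

    Assumption: errors are lower-is-better, so a ratio baseline/new above 1
    means the new model is better on that application.
    """
    if len(baseline_errors) != len(new_errors):
        raise ValueError("both error lists must cover the same applications")
    ratios = [b / n for b, n in zip(baseline_errors, new_errors)]
    return (geometric_mean(ratios) - 1.0) * 100.0


# Hypothetical per-application errors, for illustration only (not paper results).
baseline = [0.40, 0.20, 0.80]
adapted = [0.20, 0.10, 0.40]
print(f"geomean improvement: {geomean_improvement(baseline, adapted):.1f}%")
```

A geometric mean is a natural choice here because it averages ratios, so a single application with a very large gain does not dominate the summary the way it would in an arithmetic mean.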
