Zedong Peng

LG
h-index11
5papers
37citations
Novelty29%
AI Score44

5 Papers

OCDec 12, 2024Code
MPAX: Mathematical Programming in JAX

Haihao Lu, Zedong Peng, Jinwen Yang

This paper presents MPAX (Mathematical Programming in JAX), a versatile and efficient toolbox for integrating linear programming (LP) into machine learning workflows. MPAX implemented the state-of-the-art first-order methods, restarted average primal-dual hybrid gradient and reflected restarted Halpern primal-dual hybrid gradient, to solve LPs in JAX. This provides native support for hardware accelerations along with features like batch solving, auto-differentiation, and device parallelism. Extensive numerical experiments demonstrate the advantages of MPAX over existing solvers. The solver is available at https://github.com/MIT-Lu-Lab/MPAX.

ARJul 4, 2025Code
ForgeHLS: A Large-Scale, Open-Source Dataset for High-Level Synthesis

Zedong Peng, Zeju Li, Mingzhe Gao et al.

High-Level Synthesis (HLS) plays a crucial role in modern hardware design by transforming high-level code into optimized hardware implementations. However, progress in applying machine learning (ML) to HLS optimization has been hindered by a shortage of sufficiently large and diverse datasets. To bridge this gap, we introduce ForgeHLS, a large-scale, open-source dataset explicitly designed for ML-driven HLS research. ForgeHLS comprises over 400k diverse designs generated from 846 kernels covering a broad range of application domains, consuming over 200k CPU hours during dataset construction. Each kernel includes systematically automated pragma insertions (loop unrolling, pipelining, array partitioning), combined with extensive design space exploration using Bayesian optimization. Compared to existing datasets, ForgeHLS significantly enhances scale, diversity, and design coverage. We further define and evaluate representative downstream tasks in Quality of Result (QoR) prediction and automated pragma exploration, clearly demonstrating ForgeHLS utility for developing and improving ML-based HLS optimization methodologies. The dataset and code are public at https://github.com/zedong-peng/ForgeHLS.

LGFeb 25, 2025
DeepCircuitX: A Comprehensive Repository-Level Dataset for RTL Code Understanding, Generation, and PPA Analysis

Zeju Li, Changran Xu, Zhengyuan Shi et al.

This paper introduces DeepCircuitX, a comprehensive repository-level dataset designed to advance RTL (Register Transfer Level) code understanding, generation, and power-performance-area (PPA) analysis. Unlike existing datasets that are limited to either file-level RTL code or physical layout data, DeepCircuitX provides a holistic, multilevel resource that spans repository, file, module, and block-level RTL code. This structure enables more nuanced training and evaluation of large language models (LLMs) for RTL-specific tasks. DeepCircuitX is enriched with Chain of Thought (CoT) annotations, offering detailed descriptions of functionality and structure at multiple levels. These annotations enhance its utility for a wide range of tasks, including RTL code understanding, generation, and completion. Additionally, the dataset includes synthesized netlists and PPA metrics, facilitating early-stage design exploration and enabling accurate PPA prediction directly from RTL code. We demonstrate the dataset's effectiveness on various LLMs finetuned with our dataset and confirm the quality with human evaluations. Our results highlight DeepCircuitX as a critical resource for advancing RTL-focused machine learning applications in hardware design automation.Our data is available at https://zeju.gitbook.io/lcm-team.

OCApr 24
FlashFolio: A GPU-Accelerated Solver for Portfolio Optimization

Yilun Jiang, Haihao Lu, Zedong Peng et al.

We present FlashFolio, a GPU-accelerated solver for single-period and multi-period portfolio optimization with factor-based risk modeling, bid-offer spread costs, and nonlinear market impact. These models are widely used in portfolio construction and optimal execution, but become computationally challenging at large scale, especially in the multi-period setting. We benchmark FlashFolio against MOSEK on instances constructed from realistic market inputs. FlashFolio delivers consistent runtime improvements, achieving speedups of up to 12.9x in the single-period setting and 48x in the multi-period setting, while also exhibiting stronger robustness on challenging multi-period instances. Our results show that GPU-based optimization can help improve the practicality of large-scale portfolio optimization.

LGApr 10
DiffHLS: Differential Learning for High-Level Synthesis QoR Prediction with GNNs and LLM Code Embeddings

Zedong Peng, Zeju Li, Qiang Xu et al.

High-Level Synthesis (HLS) compiles C/C++ into RTL, but exploring pragma-driven optimization choices remains expensive because each design point requires time-consuming synthesis. We propose \textbf{\DiffHLS}, a differential learning framework for HLS Quality-of-Result (QoR) prediction that learns from kernel--design pairs: a kernel baseline and a pragma-inserted design variant. \DiffHLS~encodes kernel and design intermediate-representation graphs with dedicated graph neural network (GNN) branches, and augments the delta pathway with code embeddings from a pretrained code large language model (LLM). Instead of regressing absolute targets directly, we jointly predict the kernel baseline and the design-induced delta, and compose them to obtain the design prediction. On PolyBench, \DiffHLS~attains lower average MAPE than GNN baselines under four GNN backbones, and LLM code embeddings consistently improve over a GNN-only ablation. We further validate scalability on the ForgeHLS dataset.