Jiayu Zhao

LG
h-index1
3papers
6citations
Novelty50%
AI Score45

3 Papers

65.1LGMay 22Code
BitsMoE: Efficient Spectral Energy-Guided Bit Allocation for MoE LLM Quantization

Jiayu Zhao, Zihan Teng, Minhao Fan et al.

Mixture-of-Experts (MoE) large language models reduce per-token computation through sparse expert activation, but their deployment remains memory-intensive because all expert weights must be kept resident in memory. Existing MoE compression methods struggle in the ultra-low-bit regime: pruning irreversibly removes model capacity, while coarse-grained quantization fails to allocate bits according to heterogeneous expert and weight-direction importance. We propose BitsMoE, a spectral-energy-guided bit-allocation framework for MoE LLM quantization. BitsMoE decomposes each MoE layer by SVD into a shared basis and expert-specific spectral factors, retaining the shared basis without quantization to preserve common cross-expert structure and using the expert-specific factors as fine-grained quantization units. To determine the bit-width of each unit, BitsMoE formulates spectrum-wise mixed-precision quantization as an activation-aware reconstruction surrogate and solves an integer linear program that minimizes estimated reconstruction loss under a fixed bit budget. Experiments across multiple MoE LLMs show that BitsMoE substantially reduces downstream task accuracy degradation in ultra-low-bit regimes. Under 2-bit quantization on Qwen3-30B-A3B-Base, BitsMoE accelerates quantization by 12.3$\times$, improves average accuracy by 27.83 percentage points, and increases decoding speed by 1.76$\times$ over GPTQ. Our model and code are publicly available at https://github.com/zjiayu064/BitsMoE.

LGFeb 16, 2023
Unleashing the Potential of Acquisition Functions in High-Dimensional Bayesian Optimization

Jiayu Zhao, Renyu Yang, Shenghao Qiu et al.

Bayesian optimization (BO) is widely used to optimize expensive-to-evaluate black-box functions.BO first builds a surrogate model to represent the objective function and assesses its uncertainty. It then decides where to sample by maximizing an acquisition function (AF) based on the surrogate model. However, when dealing with high-dimensional problems, finding the global maximum of the AF becomes increasingly challenging. In such cases, the initialization of the AF maximizer plays a pivotal role, as an inadequate setup can severely hinder the effectiveness of the AF. This paper investigates a largely understudied problem concerning the impact of AF maximizer initialization on exploiting AFs' capability. Our large-scale empirical study shows that the widely used random initialization strategy often fails to harness the potential of an AF. In light of this, we propose a better initialization approach by employing multiple heuristic optimizers to leverage the historical data of black-box optimization to generate initial points for the AF maximize. We evaluate our approach with a range of heavily studied synthetic functions and real-world applications. Experimental results show that our techniques, while simple, can significantly enhance the standard BO and outperform state-of-the-art methods by a large margin in most test cases.

ARSep 27, 2025
VeriGRAG: Enhancing LLM-Based Verilog Code Generation with Structure-Aware Soft Prompts

Jiayu Zhao, Song Chen

Large language models (LLMs) have demonstrated strong capabilities in generating Verilog code from natural language descriptions. However, Verilog code inherently encodes structural information of hardware circuits. Effectively leveraging this structural information to enhance the functional and syntactic correctness of LLM-generated Verilog code remains a significant challenge. To address this challenge, we propose VeriGRAG , a novel framework that extracts structural graph embeddings from Verilog code using graph neural networks (GNNs). A multimodal retriever then selects the graph embeddings most relevant to the given generation task, which are aligned with the code modality through the VeriFormer module to generate structure-aware soft prompts. Our experiments demonstrate that VeriGRAG substantially improves the correctness of Verilog code generation, achieving state-of-the-art or superior performance across both VerilogEval and RTLLM benchmarks.