LG | Mar 6, 2025

Neural Network Surrogate Model for Junction Temperature and Hotspot Position in $3$D Multi-Layer High Bandwidth Memory (HBM) Chiplets under Varying Thermal Conditions

arXiv:2503.04049v1 | 2 citations | h-index: 4 | 2025 IEEE 6th International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT)
Originality: Synthesis-oriented
AI Analysis

This work addresses thermal prediction for HBM systems in high-performance computing. It offers a tool that can accelerate design and reduce reliance on costly experiments, but the contribution is incremental: it applies existing neural network methods to a new domain-specific problem.

The paper tackles thermal management in high-bandwidth memory (HBM) chiplets by developing a neural network surrogate model that predicts junction temperature and hotspot position under varying thermal conditions. Trained on a dataset of 13,494 parameter combinations, the model is reported to give accurate, fast inference and to generalize well.

As the demand for computational power increases, high-bandwidth memory (HBM) has become a critical technology for next-generation computing systems. However, the widespread adoption of HBM presents significant thermal management challenges, particularly in multilayer through-silicon-via (TSV) stacked structures under varying thermal conditions, where accurate prediction of junction temperature and hotspot position is essential during early design. This work develops a data-driven neural network model for fast prediction of junction temperature and hotspot position in 3D HBM chiplets. The model is trained on a dataset of $13{,}494$ different combinations of thermal condition parameters, sampled from a high-dimensional parameter space containing up to $3^{27}$ combinations, and it can accurately and quickly infer the junction temperature and hotspot position for any thermal conditions in that space. Moreover, it generalizes well to other thermal conditions not covered by the parameter space. The dataset is constructed using accurate finite element solvers. This method reduces the reliance on costly experimental tests and on the extensive computational resources needed for finite element analysis, and it accelerates the design and optimization of complex HBM systems, making it a valuable tool for improving thermal management and performance in high-performance computing applications.
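To give a sense of scale for the numbers in the abstract, the following minimal Python sketch computes the size of the parameter space and draws a random sample of the stated dataset size. It assumes, as one plausible reading of $3^{27}$, that the space consists of 27 thermal-condition parameters with 3 discrete levels each; the abstract does not say this explicitly. The uniform random sampling shown here is also an assumption, not the authors' documented sampling scheme, and the names `space_size` and `sample_combinations` are hypothetical.

```python
import random

# Assumptions (not stated in the abstract): 27 parameters, 3 levels each.
N_PARAMS = 27
N_LEVELS = 3
N_SAMPLES = 13_494  # dataset size quoted in the abstract


def space_size(n_params=N_PARAMS, n_levels=N_LEVELS):
    """Number of distinct parameter combinations in a full factorial grid."""
    return n_levels ** n_params


def sample_combinations(n_samples, n_params=N_PARAMS, n_levels=N_LEVELS, seed=0):
    """Draw n_samples distinct combinations uniformly at random.

    Each combination is a tuple of level indices in range(n_levels).
    Uniform sampling is a stand-in for whatever scheme the paper used.
    """
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n_samples:
        seen.add(tuple(rng.randrange(n_levels) for _ in range(n_params)))
    return sorted(seen)


total = space_size()
samples = sample_combinations(N_SAMPLES)
coverage = len(samples) / total

print(f"full space: {total:,} combinations")
print(f"sampled:    {len(samples):,} combinations")
print(f"coverage:   {coverage:.2e} of the space")
```

Under these assumptions the full grid holds about $7.6 \times 10^{12}$ combinations, so the training set covers only a tiny fraction of it. Running a finite element solve for every combination would be impractical at that scale, which is the motivation for a surrogate model.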

