Sheldon X. -D. Tan

AR
h-index3
6papers
6citations
Novelty51%
AI Score44

6 Papers

ARMar 19
WarPGNN: A Parametric Thermal Warpage Analysis Framework with Physics-aware Graph Neural Network

Haotian Lu, Jincong Lu, Sachin Sachdeva et al.

With the advent of system-in-package (SiP) chiplet-based design and heterogeneous 2.5D/3D integration, thermal-induced warpage has become a critical reliability concern. While conventional numerical approaches can deliver highly accurate results, they often incur prohib- itively high computational costs, limiting their scalability for complex chiplet-package systems. In this paper, we present WarPGNN, an ef- ficient and accurate parametric thermal warpage analysis framework powered by Graph Neural Networks (GNNs). By operating directly on graphs constructed from the floorplans, WarPGNN enables fast warpage-aware floorplan exploration and exhibits strong transfer- ability across diverse package configurations. Our method first en- codes multi-die floorplans into reduced Transitive Closure Graphs (rTCGs), then a Graph Convolution Network (GCN)-based encoder extracts hierarchical structural features, followed by a U-Net inspired decoder that reconstructs warpage maps from graph feature embed- dings. Furthermore, to address the long-tailed pattern of warpage data distribution, we developed a physics-informed loss and revised a message-passing encoder based on Graph Isomorphic Network (GIN) that further enhance learning performance for extreme cases and expressiveness of graph embeddings. Numerical results show that WarPGNN achieves more than 205.91x speedup compared with the 2-D efficient FEM-based method and over 119766.64x acceleration with 3-D FEM method COMSOL, respectively, while maintaining comparable accuracy at only 1.26% full-scale normalized RMSE and 2.21% warpage value error. Compared with recent DeepONet-based model, our method achieved comparable prediction accuracy and in- ference speedup with 3.4x lower training time. In addition, WarPGNN demonstrates remarkable transferability on unseen datasets with up to 3.69% normalized RMSE and similar runtime.

ARMar 26
Accelerating Physics-Based Electromigration Analysis via Rational Krylov Subspaces

Sheldon X. -D. Tan, Haotian Lu

Electromigration (EM) induced stress evolution is a major reliability challenge in nanometer-scale VLSI interconnects. Accurate EM analysis requires solving stress-governing partial differential equations over large interconnect trees, which is computationally expensive using conventional finite-difference methods. This work proposes two fast EM stress analysis techniques based on rational Krylov subspace reduction. Unlike traditional Krylov methods that expand around zero frequency, rational Krylov methods enable expansion at selected time constants, aligning directly with metrics such as nucleation and steady-state times and producing compact reduced models with minimal accuracy loss. Two complementary frameworks are developed: a frequency-domain extended rational Krylov method, ExtRaKrylovEM, and a time-domain rational Krylov exponential integration method, EiRaKrylovEM. We show that the accuracy of both methods depends strongly on the choice of expansion point, or shift time, and demonstrate that effective shift times are typically close to times of interest such as nucleation or post-void steady state. Based on this observation, a coordinate descent optimization strategy is introduced to automatically determine optimal reduction orders and shift times for both nucleation and post-void phases. Experimental results on synthesized structures and industry-scale power grids show that the proposed methods achieve orders-of-magnitude improvements in efficiency and accuracy over finite-difference solutions. Using only 4 to 6 Krylov orders, the methods achieve sub-0.1 percent error in nucleation time and resistance change predictions while delivering 20 to 500 times speedup. In contrast, standard extended Krylov methods require more than 50 orders and still incur 10 to 20 percent nucleation time error, limiting their practicality for EM-aware optimization and stochastic EM analysis.

ARApr 14
EMSpice 3: Full-chip Temperature-Aware Multiphysics Electromigration and IR-Drop Analysis

Haotian Lu, Sheldon X. -D. Tan

In this work, we present EMSpice 3, a full-chip temperature-aware multiphysics framework for coupled electromigration (EM), thermomigration (TM), and IR-drop analysis of power-grid networks. It unifies extracted netlists, configurable parameters, and optional chip-level thermal maps into a single flow supporting temperature-aware immortality screening, transient EM/TM stress simulation with iterative resistance updates, and optional Monte Carlo lifetime analysis. To accelerate large-tree simulations, EMSpice 3 integrates an extended rational Krylov reduction method into the transient solver without loss of accuracy. It also interfaces with Synopsys ICC and Fusion Compiler for practical deployment. By incorporating realistic spatial thermal maps into the reliability loop, the framework enables map-aware EM sign-off beyond average-temperature assumptions. Experiments on six designs show that spatial thermal variation significantly impacts EM reliability even with identical average temperature. For a RISC-V core, equal-average thermal profiles yield over 70% spread in time to failure (TTF), while an ARM Cortex-A core shows nearly 50%. The Krylov-accelerated solver achieves 1.18x - 1.50x runtime reduction. Monte Carlo analysis reveals strong design dependence: under 20% variation in diffusivity and critical stress, TTF variation is about 25\% for RISC-V but only 0.03% for ARM. These results demonstrate that EMSpice 3 enables practical, map-aware, and workload-aware full-chip EM reliability assessment.

LGMar 18, 2025
BPINN-EM-Post: Bayesian Physics-Informed Neural Network based Stochastic Electromigration Damage Analysis in the Post-void Phase

Subed Lamichhane, Haotian Lu, Sheldon X. -D. Tan

In contrast to the assumptions of most existing Electromigration (EM) analysis tools, the evolution of EM-induced stress is inherently non-deterministic, influenced by factors such as input current fluctuations and manufacturing non-idealities. Traditional approaches for estimating stress variations typically involve computationally expensive and inefficient Monte Carlo simulations with industrial solvers, which quantify variations using mean and variance metrics. In this work, we introduce a novel machine learning-based framework, termed BPINN-EM- Post, for efficient stochastic analysis of EM-induced post-voiding aging processes. For the first time, our new approach integrates closed-form analytical solutions with a Bayesian Physics- Informed Neural Network (BPINN) framework to accelerate the analysis. The closed-form solutions enforce physical laws at the individual wire segment level, while the BPINN ensures that physics constraints at inter-segment junctions are satisfied and stochastic behaviors are accurately modeled. By reducing the number of variables in the loss functions through utilizing analytical solutions, our method significantly improves training efficiency without accuracy loss and naturally incorporates variational effects. Additionally, the analytical solutions effectively address the challenge of incorporating initial stress distributions in interconnect structures during post-void stress calculations. Numerical results demonstrate that BPINN-EM-Post achieves over 240x and more than 67x speedup compared to Monte Carlo simulations using the FEM-based COMSOL solver and FDM-based EMSpice, respectively, with marginal accuracy loss.

LGApr 27, 2020
EM-GAN: Fast Stress Analysis for Multi-Segment Interconnect Using Generative Adversarial Networks

Wentian Jin, Sheriff Sadiqbatcha, Jinwei Zhang et al.

In this paper, we propose a fast transient hydrostatic stress analysis for electromigration (EM) failure assessment for multi-segment interconnects using generative adversarial networks (GANs). Our work leverages the image synthesis feature of GAN-based generative deep neural networks. The stress evaluation of multi-segment interconnects, modeled by partial differential equations, can be viewed as time-varying 2D-images-to-image problem where the input is the multi-segment interconnects topology with current densities and the output is the EM stress distribution in those wire segments at the given aging time. Based on this observation, we train conditional GAN model using the images of many self-generated multi-segment wires and wire current densities and aging time (as conditions) against the COMSOL simulation results. Different hyperparameters of GAN were studied and compared. The proposed algorithm, called {\it EM-GAN}, can quickly give accurate stress distribution of a general multi-segment wire tree for a given aging time, which is important for full-chip fast EM failure assessment. Our experimental results show that the EM-GAN shows 6.6\% averaged error compared to COMSOL simulation results with orders of magnitude speedup. It also delivers 8.3X speedup over state-of-the-art analytic based EM analysis solver.

CVMay 21, 2018
DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization

Yuan Cheng, Guangya Li, Hai-Bao Chen et al.

As it requires a huge number of parameters when exposed to high dimensional inputs in video detection and classification, there is a grand challenge to develop a compact yet accurate video comprehension at terminal devices. Current works focus on optimizations of video detection and classification in a separated fashion. In this paper, we introduce a video comprehension (object detection and action recognition) system for terminal devices, namely DEEPEYE. Based on You Only Look Once (YOLO), we have developed an 8-bit quantization method when training YOLO; and also developed a tensorized-compression method of Recurrent Neural Network (RNN) composed of features extracted from YOLO. The developed quantization and tensorization can significantly compress the original network model yet with maintained accuracy. Using the challenging video datasets: MOMENTS and UCF11 as benchmarks, the results show that the proposed DEEPEYE achieves 3.994x model compression rate with only 0.47% mAP decreased; and 15,047x parameter reduction and 2.87x speed-up with 16.58% accuracy improvement.