A Precision Emulation Approach to the GPU Acceleration of Ab Initio Electronic Structure Calculations

arXiv:2603.2997541.1
AI Analysis

This work addresses performance bottlenecks in scientific computing for HPC users, offering an incremental optimization through tunable precision emulation.

This study tackles the problem of accelerating traditional FP64-based HPC workloads on modern GPUs by using INT8-based emulation with SCILIB-Accel, showing that accuracy and performance can potentially be improved together, without code changes.

This study explores the use of INT8-based emulation for accelerating traditional FP64-based HPC workloads on modern GPU architectures. Using SCILIB-Accel, an automatic BLAS offload tool for cache-coherent Unified Memory Architectures, we emulate FP64 matrix multiplications in the LSMS CPU application in the MuST suite without code changes. We find that accuracy depends on both the arithmetic precision and the properties of the operator, which can be addressed through tunable precision emulation. Unlike traditional mixed-precision approaches, this method preserves the original algorithms while optimizing hardware utilization. We showcase the potential of improving accuracy and performance at the same time. This work highlights the potential of AI-driven hardware to transform HPC and advocates adaptive precision strategies in future scientific computing.
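
The abstract does not spell out how FP64 products are built from INT8 arithmetic. As a rough illustration only, the sketch below assumes an Ozaki-style slicing scheme, which may differ from what the paper and SCILIB-Accel actually use: each row of A and each column of B is scaled by a power of two, cut into 7-bit integer slices that fit a signed 8-bit value, multiplied slice by slice with exact integer dot products, and recombined in FP64. The names `split_rows`, `emulated_matmul`, `SLICE_BITS`, and `num_slices` are hypothetical; `num_slices` plays the role of the tunable precision knob.

```python
import math

SLICE_BITS = 7  # assumption: 7 bits per slice, so every slice value satisfies |q| <= 127


def split_rows(M, num_slices):
    """Split each row of M into integer slices sharing one power-of-two scale per row."""
    exps = []
    slices = [[] for _ in range(num_slices)]
    for row in M:
        amax = max(abs(x) for x in row)
        # frexp returns amax = m * 2**e with 0.5 <= m < 1, so |x| * 2**-e < 1
        e = math.frexp(amax)[1] if amax > 0 else 0
        exps.append(e)
        resid = [math.ldexp(x, -e) for x in row]
        for s in range(num_slices):
            scaled = [math.ldexp(r, SLICE_BITS) for r in resid]
            q = [int(v) for v in scaled]  # truncate toward zero
            slices[s].append(q)
            resid = [v - qi for v, qi in zip(scaled, q)]  # remaining fraction, |resid| < 1
    return exps, slices


def emulated_matmul(A, B, num_slices):
    """Approximate A @ B using only small-integer dot products plus power-of-two scaling."""
    Bt = [list(col) for col in zip(*B)]  # columns of B, so B is scaled per column
    ea, qa = split_rows(A, num_slices)
    eb, qb = split_rows(Bt, num_slices)
    C = [[0.0] * len(Bt) for _ in range(len(A))]
    for i in range(len(A)):
        for j in range(len(Bt)):
            acc = 0.0
            # Keep only slice pairs with s + t < num_slices; the rest are below the target precision.
            for s in range(num_slices):
                for t in range(num_slices - s):
                    dot = sum(x * y for x, y in zip(qa[s][i], qb[t][j]))  # exact integer dot product
                    acc += math.ldexp(dot, -SLICE_BITS * (s + t + 2))
            C[i][j] = math.ldexp(acc, ea[i] + eb[j])  # undo the row and column scaling
    return C


if __name__ == "__main__":
    A = [[1.5, -2.25], [0.1, 3.0]]
    B = [[0.3, 4.0], [-1.2, 0.7]]
    ref = [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]
    for n in (3, 8):
        C = emulated_matmul(A, B, n)
        err = max(abs(C[i][j] - ref[i][j]) for i in range(2) for j in range(2))
        print(f"num_slices={n}: max abs error vs. plain FP64 = {err:.3e}")
```

In this sketch, raising `num_slices` lowers the error at the cost of more integer products, which is one way to read "tunable precision". The per-row and per-column scaling also hints at why accuracy can depend on the operator: entries much smaller than the largest entry in their row or column keep fewer significant bits.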

