AR LG Oct 21, 2021

Supporting Massive DLRM Inference Through Software Defined Memory

Ehsan K. Ardestani, Changkyu Kim, Seung Jae Lee, Luoshang Pan, Valmiki Rampersad, Jens Axboe, Banit Agrawal, Fuxun Yu, Ansha Yu, Trung Le, Hector Yuen, Shishir Juluri

arXiv:2110.11489v25.127 citations

Originality: Incremental advance

AI Analysis

The work addresses the growing power and cost burden on data centers running large DLRMs, though it appears to be an incremental optimization of existing memory technologies.

The paper tackles the challenge of massive Deep Learning Recommendation Model (DLRM) inference by extending the memory hierarchy to Storage Class Memory (SCM) using Software Defined Memory, achieving power savings of 5% to 29%.

Deep Learning Recommendation Models (DLRM) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model sizes soon to reach the terabyte range, leveraging Storage Class Memory (SCM) for inference enables lower power consumption and cost. This paper evaluates the major challenges in extending the memory hierarchy to SCM for DLRM, and presents different techniques to improve performance through a Software Defined Memory. We show how underlying technologies such as NAND Flash and 3DXP differ and how they relate to real-world scenarios, enabling power savings of 5% to 29%.
