RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference
This work targets two problems facing datacenter applications such as search and social media: the high infrastructure cost of large DRAM-based memories and the inference slowdown caused by conventional SSDs. It offers a more efficient storage solution, though the contribution is incremental because it builds on existing near data processing concepts.
The paper tackles the high read latency and limited bandwidth of conventional SSDs used for neural recommendation inference by introducing RecSSD, a near data processing based SSD memory system that reduces end-to-end inference latency by 2X, relative to COTS SSDs, across eight industry-representative models.
Neural personalized recommendation models are used across a wide variety of datacenter applications, including search, social media, and entertainment. State-of-the-art models comprise large embedding tables with billions of parameters, which require large memory capacities. Unfortunately, large and fast DRAM-based memories levy high infrastructure costs. Conventional SSD-based storage solutions offer an order of magnitude more capacity but have worse read latency and bandwidth, which degrades inference performance. RecSSD is a near data processing based SSD memory system customized for neural recommendation inference; it reduces end-to-end model inference latency by 2X compared to commercial off-the-shelf (COTS) SSDs across eight industry-representative models.
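The abstract does not say which operations RecSSD moves into the drive, so the following is only a toy sketch of the general near data processing idea, not the paper's design. It assumes, purely for illustration, that an inference request gathers several embedding rows and sums them, and it compares how many bytes cross the storage interface when the host does the sum versus when the sum is computed next to the data. All names (`make_table`, `host_side_pool`, `near_data_pool`), the vector width, and the float32 size are hypothetical, and the byte counts come from this toy model rather than from any measurement.

```python
from typing import List, Tuple

EMBEDDING_DIM = 4     # hypothetical vector width, chosen for readability
BYTES_PER_VALUE = 4   # assumption: values are stored as float32

Table = List[List[float]]


def make_table(num_rows: int, dim: int = EMBEDDING_DIM) -> Table:
    """Build a toy embedding table in which row i is filled with the value i."""
    return [[float(i)] * dim for i in range(num_rows)]


def host_side_pool(table: Table, indices: List[int]) -> Tuple[List[float], int]:
    """Baseline: every requested row is transferred to the host, which then sums them."""
    dim = len(table[0])
    rows = [table[i] for i in indices]                # rows that cross the interface
    bytes_moved = len(rows) * dim * BYTES_PER_VALUE
    pooled = [sum(row[d] for row in rows) for d in range(dim)]
    return pooled, bytes_moved


def near_data_pool(table: Table, indices: List[int]) -> Tuple[List[float], int]:
    """Near-data variant: the sum is formed beside the data; only the result is returned."""
    dim = len(table[0])
    pooled = [sum(table[i][d] for i in indices) for d in range(dim)]
    bytes_moved = dim * BYTES_PER_VALUE               # one pooled vector crosses the interface
    return pooled, bytes_moved


if __name__ == "__main__":
    table = make_table(10)
    indices = [1, 2, 3, 7]
    print(host_side_pool(table, indices))
    print(near_data_pool(table, indices))
```

In this model the transferred volume of the host-side path grows with the number of gathered rows, while the near-data path returns a single vector regardless of how many rows are pooled. The sketch ignores device latency, request overheads, and the limited compute available inside a drive, so it says nothing about the 2X figure reported above.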