Memory-Oriented Design-Space Exploration of Edge-AI Hardware for XR Applications
This work addresses energy and area efficiency for on-device XR-AI inference. Its contribution is incremental, since it applies existing memory technologies to specific workloads.
The paper tackles the problem of optimizing edge-AI hardware for extended reality (XR) applications by exploring the hardware design space with emerging non-volatile memory (MRAM). It reports energy savings of at least 24% and area reductions of at least 30% for hand detection and eye segmentation workloads at the 7 nm technology node.
Low-power edge-AI capabilities are essential for on-device extended reality (XR) applications to support the vision of the Metaverse. In this work, we investigate two representative XR workloads for hardware design-space exploration: (i) hand detection and (ii) eye segmentation. For both applications, we train deep neural networks and analyze the impact of quantization and hardware-specific bottlenecks. Through simulations, we evaluate a CPU and two systolic inference accelerator implementations. Next, we compare these hardware solutions at advanced technology nodes. We then evaluate the impact of integrating state-of-the-art emerging non-volatile memory technology (STT/SOT/VGSOT MRAM) into the XR-AI inference pipeline. We find that introducing non-volatile memory into the memory hierarchy of designs at the 7 nm node yields significant energy benefits (≥24%) for hand detection at 10 inferences per second (IPS) and eye segmentation at 0.1 IPS, while still meeting the minimum IPS requirement. Moreover, we can realize a substantial reduction in area (≥30%) owing to the small form factor of MRAM compared to traditional SRAM.
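One way to see why the target inference rate matters for the memory choice is a first-order energy model: each inference costs some dynamic memory energy, and the memory then draws standby power for the rest of the inference period. The sketch below illustrates that reasoning only. It is not the simulation flow used in this work, and every name and number in it (MemoryTech, the access energies, the standby powers, the latency) is a hypothetical placeholder rather than a measured or reported value. It also assumes the non-volatile memory can be fully power-gated while idle, which ignores peripheral leakage and wake-up cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTech:
    """Hypothetical first-order description of an on-chip memory option."""
    name: str
    access_energy_j: float   # dynamic memory energy per inference (J)
    standby_power_w: float   # power drawn while idle between inferences (W)


def energy_per_inference(mem: MemoryTech, ips: float, latency_s: float) -> float:
    """Memory energy per inference: dynamic energy plus standby energy
    accumulated over the idle part of one inference period (1 / ips)."""
    if ips <= 0:
        raise ValueError("ips must be positive")
    period_s = 1.0 / ips
    if latency_s > period_s:
        raise ValueError("latency exceeds the period; target IPS cannot be met")
    idle_s = period_s - latency_s
    return mem.access_energy_j + mem.standby_power_w * idle_s


def savings_fraction(baseline: MemoryTech, candidate: MemoryTech,
                     ips: float, latency_s: float) -> float:
    """Fractional energy saving of `candidate` relative to `baseline`."""
    e_base = energy_per_inference(baseline, ips, latency_s)
    e_cand = energy_per_inference(candidate, ips, latency_s)
    return (e_base - e_cand) / e_base


# Placeholder values chosen only to show the trend; not taken from the paper.
SRAM = MemoryTech("SRAM (placeholder)", access_energy_j=1.0e-3, standby_power_w=5.0e-2)
MRAM = MemoryTech("MRAM (placeholder)", access_energy_j=1.5e-3, standby_power_w=0.0)

if __name__ == "__main__":
    for ips in (10.0, 0.1):
        s = savings_fraction(SRAM, MRAM, ips=ips, latency_s=0.01)
        print(f"IPS={ips}: fractional saving = {s:.3f}")
```

Under these assumptions, the relative saving grows as the inference rate drops, because idle time and therefore standby energy dominate the volatile baseline. The absolute percentages depend entirely on the placeholder parameters and should not be compared with the results reported here.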