ET | AI | AR | Jul 2, 2025

Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems

arXiv:2507.01429v1 | 3 citations | h-index: 39 | J Syst Archit
Originality: Incremental advance
AI Analysis

This work addresses energy and area constraints for embedded AI applications using a novel hardware-software co-design approach, though it is incremental as it builds on existing in-memory computing and racetrack memory technologies.

The paper tackles the challenge of building efficient in-memory arithmetic circuits for CNN inference on racetrack memory in embedded systems, achieving significant improvements in energy and performance while maintaining model accuracy.

Deep neural networks generate and process large volumes of data, posing challenges for low-resource embedded systems. In-memory computing has been demonstrated as an efficient computing infrastructure and shows promise for embedded AI applications. Among newly researched memory technologies, racetrack memory is a non-volatile technology that allows high-data-density fabrication, making it a good fit for in-memory computing. However, integrating in-memory arithmetic circuits with memory cells affects both memory density and power efficiency. It remains challenging to build efficient in-memory arithmetic circuits on racetrack memory within area and energy constraints. To this end, we present an efficient in-memory convolutional neural network (CNN) accelerator optimized for use with racetrack memory. We design a series of fundamental arithmetic circuits as in-memory computing cells suited for multiply-and-accumulate operations. Moreover, we explore the design space of racetrack-memory-based systems and CNN model architectures, employing co-design to improve the efficiency and performance of CNN inference in racetrack memory while maintaining model accuracy. Our designed circuits and model-system co-optimization strategies achieve a small memory bank area with significant improvements in energy and performance for racetrack-memory-based embedded systems.
