AR AI Aug 20, 2025

Computing-In-Memory Dataflow for Minimal Buffer Traffic

arXiv:2508.14375v11.2 h-index: 4 ICCD

Originality: Incremental advance

AI Analysis

The work addresses a bottleneck for edge AI devices by reducing data movement, though the advance is incremental: it optimizes an existing method for a specific challenge.

The paper tackles heavy buffer traffic in Computing-In-Memory (CIM) architectures when accelerating depthwise convolution in lightweight models such as MobileNet and EfficientNet. The proposed dataflow reduces buffer traffic by 77.4-87.0%, which lowers total data traffic energy by 10.1-17.9% and latency by 15.6-27.8% relative to a conventional weight-stationary baseline.

Computing-In-Memory (CIM) offers a potential solution to the memory wall issue and can achieve high energy efficiency by minimizing data movement, making it a promising architecture for edge AI devices. Lightweight models like MobileNet and EfficientNet, which utilize depthwise convolution for feature extraction, have been developed for these devices. However, CIM macros often face challenges in accelerating depthwise convolution, including underutilization of CIM memory and heavy buffer traffic. The latter, in particular, has been overlooked despite its significant impact on latency and energy consumption. To address this, we introduce a novel CIM dataflow that significantly reduces buffer traffic by maximizing data reuse and improving memory utilization during depthwise convolution. The proposed dataflow is grounded in solid theoretical principles, fully demonstrated in this paper. When applied to MobileNet and EfficientNet models, our dataflow reduces buffer traffic by 77.4-87.0%, reducing total data traffic energy and latency by 10.1-17.9% and 15.6-27.8%, respectively, compared to the baseline (conventional weight-stationary dataflow).
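
To make the link between data reuse and buffer traffic concrete, here is a minimal counting sketch. It is a toy model and not the paper's dataflow or its baseline: the function name `depthwise_buffer_reads`, the no-padding assumption, and the two idealized fetch policies are all assumptions introduced for illustration. It compares fetching the whole kernel window for every output pixel against fetching each used input element exactly once, so its numbers are not comparable to the 77.4-87.0% figure reported above.

```python
def depthwise_buffer_reads(height, width, channels, kernel=3, stride=1):
    """Count input-activation buffer reads for one depthwise conv layer.

    Toy model (assumption, not the paper's method), no padding:
      - no_reuse:   every output pixel fetches its full kernel x kernel window.
      - full_reuse: every input element covered by some window is fetched once.
    """
    if not 1 <= stride <= kernel:
        raise ValueError("this sketch assumes 1 <= stride <= kernel")
    if height < kernel or width < kernel:
        raise ValueError("input must be at least as large as the kernel")

    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1

    # Depthwise convolution applies one filter per channel, so there is
    # no reuse of an input element across output channels.
    no_reuse = channels * out_h * out_w * kernel * kernel

    # Span of rows/columns touched by at least one window.
    used_h = (out_h - 1) * stride + kernel
    used_w = (out_w - 1) * stride + kernel
    full_reuse = channels * used_h * used_w

    return no_reuse, full_reuse


if __name__ == "__main__":
    # Hypothetical layer shape chosen only for illustration.
    naive, ideal = depthwise_buffer_reads(112, 112, 32, kernel=3, stride=1)
    print(f"no reuse:   {naive} reads")
    print(f"full reuse: {ideal} reads")
    print(f"reduction:  {1 - ideal / naive:.1%}")
```

The gap between the two counts is the headroom that any reuse-oriented dataflow can exploit; how much of it a real CIM macro recovers depends on array size, mapping, and buffer organization, none of which this sketch models.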
