CR · Oct 14, 2021

Bandwidth Utilization Side-Channel on ML Inference Accelerators

Sarbartha Banerjee, Shijia Wei, Prakash Ramrakhyani, Mohit Tiwari

arXiv:2110.07157v1 · 8.84 citations

Originality: Incremental advance

AI Analysis

The work reveals a security vulnerability in ML inference accelerators that could compromise model confidentiality in practical deployments.

The paper demonstrates that bandwidth utilization on the interface between the accelerator and its weight storage can act as a side channel that leaks confidential ML model architecture. The leak persists even with data and memory address encryption, and it can be monitored via performance counters or bus contention.

Accelerators used for machine learning (ML) inference provide great performance benefits over CPUs. Securing confidential models during inference against off-chip side-channel attacks is critical to harnessing this performance advantage in practice. Data and memory address encryption have recently been proposed to defend against off-chip attacks. In this paper, we demonstrate that bandwidth utilization on the interface between accelerators and the weight storage can serve as a side channel for leaking confidential ML model architecture. This side channel is independent of the type of interface, leaks even in the presence of data and memory address encryption, and can be monitored through performance counters or through bus contention from an on-chip unprivileged process.
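To make the idea concrete, the following is a minimal toy sketch of how a bandwidth trace alone could hint at a model's layer structure. It is not the authors' method or measurement setup. The `Layer` class, the function names, the noise-free trace, and the assumption that each layer fetches its weights at a constant rate are all hypothetical simplifications introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Hypothetical layer description (illustrative only)."""
    weight_bytes: int  # bytes fetched from weight storage for this layer
    duration: int      # number of sampling intervals the layer occupies


def bandwidth_trace(layers):
    """Simulate bytes transferred per sampling interval.

    Assumes each layer streams its weights at a constant rate, so the
    trace is a step function with one plateau per layer.
    """
    trace = []
    for layer in layers:
        per_interval = layer.weight_bytes // layer.duration
        trace.extend([per_interval] * layer.duration)
    return trace


def segment(trace, tolerance=0.1):
    """Split a trace into runs of near-constant utilization.

    Returns a list of (level, length) pairs. A new run starts when a
    sample differs from the run's first sample by more than `tolerance`
    (relative).
    """
    segments = []
    start = 0
    for i in range(1, len(trace) + 1):
        at_end = i == len(trace)
        if at_end or abs(trace[i] - trace[start]) > tolerance * max(trace[start], 1):
            segments.append((trace[start], i - start))
            start = i
    return segments


def estimate_weight_bytes(segments):
    """Estimate per-layer weight volume as level x duration."""
    return [level * length for level, length in segments]


# Example: a three-layer "secret" model observed only through its trace.
secret_model = [Layer(4000, 4), Layer(9000, 3), Layer(1000, 5)]
observed = bandwidth_trace(secret_model)
recovered = estimate_weight_bytes(segment(observed))
print(len(recovered), recovered)  # prints: 3 [4000, 9000, 1000]
```

In this toy setting the observer never sees weight values or addresses, only transfer volume over time, which is why encryption does not hide the signal. Real traces would be noisy, and adjacent layers with similar utilization would merge into one segment under this simple thresholding.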
