PF AI AR LG | Jun 12, 2024

It's all about PR -- Smart Benchmarking AI Accelerators using Performance Representatives

Alexander Louis-Ferdinand Jung, Jannik Steinmetz, Jonathan Gietz, Konstantin Lübeck, Oliver Bringmann

arXiv:2406.08330v11.2

Originality: Incremental advance

AI Analysis

This work addresses the time and hardware availability challenges in benchmarking AI accelerators, offering a more efficient approach to performance modeling, though it is an incremental improvement over existing statistical methods.

The paper aims to reduce the large amount of data needed to train statistical performance models of AI hardware accelerators. It proposes a methodology that uses Performance Representatives (PRs) to identify key DNN layers for benchmarking, achieving a Mean Absolute Percentage Error (MAPE) as low as 0.02% for single-layer estimations with fewer than 10,000 training samples.

Statistical models are widely used to estimate the performance of commercial off-the-shelf (COTS) AI hardware accelerators. However, training statistical performance models often requires vast amounts of data, which demands a significant time investment and can be difficult when hardware availability is limited. To alleviate this problem, we propose a novel performance modeling methodology that significantly reduces the number of training samples while maintaining good accuracy. Our approach leverages knowledge of the target hardware architecture and initial parameter sweeps to identify a set of Performance Representatives (PRs) for deep neural network (DNN) layers. These PRs are then used for benchmarking, building a statistical performance model, and making estimations. Compared with random sampling, this targeted approach drastically reduces the number of training samples needed to achieve better estimation accuracy. We achieve a Mean Absolute Percentage Error (MAPE) as low as 0.02% for single-layer estimations and 0.68% for whole-DNN estimations with fewer than 10,000 training samples. The results demonstrate the superiority of our method for single-layer estimations compared to models trained with randomly sampled datasets of the same size.
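The accuracy figures above are reported as MAPE. For readers unfamiliar with the metric, the following is a minimal sketch of how it is commonly computed, averaging the absolute relative deviation of each estimate from its measurement. The function name `mape` and the latency values are illustrative assumptions and are not taken from the paper.

```python
def mape(measured, estimated):
    """Return the Mean Absolute Percentage Error, in percent.

    `measured` holds the reference values (e.g. benchmarked latencies),
    `estimated` holds the model's predictions for the same items.
    """
    if len(measured) != len(estimated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(m == 0 for m in measured):
        raise ValueError("measured values must be non-zero")
    total = sum(abs((m - e) / m) for m, e in zip(measured, estimated))
    return 100.0 * total / len(measured)


# Hypothetical layer latencies in milliseconds (illustrative only,
# not results from the paper).
measured_ms = [2.0, 4.0, 8.0]
estimated_ms = [2.1, 3.8, 8.0]

error_percent = mape(measured_ms, estimated_ms)
print(f"MAPE: {error_percent:.2f}%")
```

Because each error is divided by its own measured value, short and long layers contribute equally to the average, which is one reason the metric is convenient for comparing estimates across layers of very different cost.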

