PF DC LG May 21, 2019

Performance Analysis of Deep Learning Workloads on Leading-edge Systems

arXiv:1905.08764v2, 23 citations
Originality: Synthesis-oriented
AI Analysis

This work provides a performance comparison for researchers and practitioners choosing systems for deep learning tasks, but it is incremental because it applies existing methods to new hardware.

This work analyzed the performance of leading-edge systems like NVIDIA DGX-2 and AWS P3 on deep learning workloads from computer vision and NLP, examining factors such as communication interconnects and optimizations to compare their effectiveness.

This work examines the performance of leading-edge systems designed for machine learning computing, including the NVIDIA DGX-2, Amazon Web Services (AWS) P3, IBM Power System Accelerated Compute Server AC922, and a consumer-grade Exxact TensorEX TS4 GPU server. Representative deep learning workloads from the fields of computer vision and natural language processing are the focus of the analysis. Performance analysis is performed along a number of important dimensions. The performance of the communication interconnects and of large and high-throughput deep learning models is considered. Different potential use models for the systems, standalone and in the cloud, are also examined. The effect of various optimizations of the deep learning models and system configurations is included in the analysis.
