AR, LG · Nov 22, 2020

Third ArchEdge Workshop: Exploring the Design Space of Efficient Deep Neural Networks

arXiv:2011.10912v1
AI Analysis

This work aims to improve the efficiency of deep neural networks for researchers and practitioners by identifying better accuracy-latency trade-offs and new dimensions for redundancy reduction.

This paper explores the design space of efficient deep neural networks (DNNs) along two axes: static architecture design efficiency, examined through full-stack profiling at the GPU core level, and dynamic model execution efficiency, examined through DNN feature map redundancy at runtime.

This paper gives an overview of our ongoing work on the design space exploration of efficient deep neural networks (DNNs). Specifically, we cover two aspects: (1) static architecture design efficiency and (2) dynamic model execution efficiency. For static architecture design, unlike existing work that relies on end-to-end hardware modeling assumptions, we conduct full-stack profiling at the GPU core level to identify better accuracy-latency trade-offs for DNN designs. For dynamic model execution, unlike prior work that tackles model redundancy at the DNN channel level, we explore a new dimension of DNN feature map redundancy that can be dynamically traversed at runtime. Finally, we highlight several open questions that are poised to draw research attention in the next few years.
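
To make the notion of an accuracy-latency trade-off concrete, here is a minimal Python sketch of how one might pick out the non-dominated designs from a set of profiled candidates. It is an illustration only, not the authors' method: the function name `pareto_front`, the candidate names, and all latency and accuracy numbers are hypothetical, and in practice the latencies would come from profiling (such as the GPU core-level profiling described above).

```python
from typing import List, Tuple

# (name, latency in ms, top-1 accuracy); all values below are hypothetical
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate beats on both
    latency (lower is better) and accuracy (higher is better)."""
    # Sort by latency ascending; on equal latency, higher accuracy first.
    ordered = sorted(candidates, key=lambda c: (c[1], -c[2]))
    front: List[Candidate] = []
    best_acc = float("-inf")
    for cand in ordered:
        # A slower design is only worth keeping if it is strictly more accurate
        # than every faster (or equally fast) design seen so far.
        if cand[2] > best_acc:
            front.append(cand)
            best_acc = cand[2]
    return front


# Illustrative numbers only; not taken from the paper.
designs = [
    ("net_a", 5.0, 0.70),
    ("net_b", 8.0, 0.76),
    ("net_c", 9.0, 0.74),   # slower and less accurate than net_b
    ("net_d", 12.0, 0.78),
    ("net_e", 12.0, 0.77),  # same latency as net_d, less accurate
]
front_names = [name for name, _, _ in pareto_front(designs)]
print(front_names)
```

Under these assumed numbers, `net_c` and `net_e` drop out because another design is at least as fast and more accurate. A "better" trade-off in this sense is a design that moves the front toward lower latency at the same accuracy, or higher accuracy at the same latency.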

