LGAIARMLSep 14, 2020

DANCE: Differentiable Accelerator/Network Co-Exploration

arXiv:2009.06237v350 citations
AI Analysis

This work addresses the challenge of efficiently co-designing DNNs and specialized accelerators for researchers and practitioners in machine learning and hardware design, representing a novel method rather than an incremental improvement.

The paper tackles the chicken-and-egg problem of optimizing neural network architectures and hardware accelerators simultaneously by introducing DANCE, a differentiable co-exploration method that uses a neural network to model hardware metrics, achieving superior accuracy and hardware cost metrics in significantly shorter time compared to naive approaches.

To cope with the ever-increasing computational demand of the DNN execution, recent neural architecture search (NAS) algorithms consider hardware cost metrics into account, such as GPU latency. To further pursue a fast, efficient execution, DNN-specialized hardware accelerators are being designed for multiple purposes, which far-exceeds the efficiency of the GPUs. However, those hardware-related metrics have been proven to exhibit non-linear relationships with the network architectures. Therefore it became a chicken-and-egg problem to optimize the network against the accelerator, or to optimize the accelerator against the network. In such circumstances, this work presents DANCE, a differentiable approach towards the co-exploration of the hardware accelerator and network architecture design. At the heart of DANCE is a differentiable evaluator network. By modeling the hardware evaluation software with a neural network, the relation between the accelerator architecture and the hardware metrics becomes differentiable, allowing the search to be performed with backpropagation. Compared to the naive existing approaches, our method performs co-exploration in a significantly shorter time, while achieving superior accuracy and hardware cost metrics.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes