LG ARMar 2, 2021

Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search

Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, Angshuman Parashar, Christopher W. Fletcher

arXiv:2103.01489v117.9117 citationsHas Code

Originality Highly original

AI Analysis

This addresses a core challenge in designing efficient specialized hardware for modern computing, offering a novel solution to a previously intractable search problem, though it is incremental in improving upon existing optimization schemes.

The paper tackles the problem of efficiently searching for optimal mappings from algorithms to specialized hardware accelerators, which is a large, non-convex, and non-smooth space. It proposes Mind Mappings, a gradient-based search method that achieves average energy-delay product improvements of 1.40× to 4.19× over prior methods like Simulated Annealing and Genetic Algorithms, and returns mappings within 5.32× of a theoretical lower-bound.

Modern day computing increasingly relies on specialization to satiate growing performance and efficiency requirements. A core challenge in designing such specialized hardware architectures is how to perform mapping space search, i.e., search for an optimal mapping from algorithm to hardware. Prior work shows that choosing an inefficient mapping can lead to multiplicative-factor efficiency overheads. Additionally, the search space is not only large but also non-convex and non-smooth, precluding advanced search techniques. As a result, previous works are forced to implement mapping space search using expert choices or sub-optimal search heuristics. This work proposes Mind Mappings, a novel gradient-based search method for algorithm-accelerator mapping space search. The key idea is to derive a smooth, differentiable approximation to the otherwise non-smooth, non-convex search space. With a smooth, differentiable approximation, we can leverage efficient gradient-based search algorithms to find high-quality mappings. We extensively compare Mind Mappings to black-box optimization schemes used in prior work. When tasked to find mappings for two important workloads (CNN and MTTKRP), the proposed search finds mappings that achieve an average $1.40\times$, $1.76\times$, and $1.29\times$ (when run for a fixed number of steps) and $3.16\times$, $4.19\times$, and $2.90\times$ (when run for a fixed amount of time) better energy-delay product (EDP) relative to Simulated Annealing, Genetic Algorithms and Reinforcement Learning, respectively. Meanwhile, Mind Mappings returns mappings with only $5.32\times$ higher EDP than a possibly unachievable theoretical lower-bound, indicating proximity to the global optima.

View on arXiv PDF Code

Similar