LG, Apr 25, 2021

Balancing Accuracy and Latency in Multipath Neural Networks

arXiv:2104.12040v1
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient neural networks in hand-held and IoT devices, but it is incremental as it builds on existing architecture search and pruning techniques.

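The paper tackles the trade-off between accuracy and latency in neural networks for limited-resource environments. It uses a one-shot neural architecture search to implicitly evaluate multipath networks, combines that search with pruning and architecture sample evaluation, and reports that the resulting model captures the relative performance of models with different latencies and predicts the performance of unseen models across different datasets.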

The growing capacity of neural networks has strongly contributed to their success at complex machine learning tasks, and the computational demand of such large models has, in turn, stimulated a significant improvement in the hardware necessary to accelerate their computations. However, models with high latency aren't suitable for limited-resource environments such as hand-held and IoT devices. Hence, many deep learning techniques aim to address this problem by developing models with reasonable accuracy without violating the limited-resource constraint. In this work, we use a one-shot neural architecture search model to implicitly evaluate the performance of an intractable number of multipath neural networks. Combining this architecture search with a pruning technique and architecture sample evaluation, we can model the relation between the accuracy and the latency of a spectrum of models with graded complexity. We show that our method can accurately model the relative performance between models with different latencies and predict the performance of unseen models with good precision across different datasets.
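To make the idea of "modeling the relation between accuracy and latency" concrete, here is a minimal sketch. It is not the paper's method: it assumes a simple log-linear curve, fits it with ordinary least squares, and checks how well the fitted curve preserves the ordering of the sampled models. The function names and the measurements are hypothetical, made up purely for illustration.

```python
import numpy as np


def fit_log_model(latency_ms, accuracy):
    """Least-squares fit of accuracy ~ a + b * log(latency).

    The log-linear form is an assumption of this sketch, not taken from the paper.
    """
    x = np.log(np.asarray(latency_ms, dtype=float))
    y = np.asarray(accuracy, dtype=float)
    b, a = np.polyfit(x, y, deg=1)  # polyfit returns [slope, intercept]
    return a, b


def predict_accuracy(a, b, latency_ms):
    """Predict accuracy for one or more latencies with the fitted curve."""
    return a + b * np.log(np.asarray(latency_ms, dtype=float))


def pairwise_rank_agreement(predicted, observed):
    """Fraction of model pairs ordered the same way in both sequences."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    i, j = np.triu_indices(len(predicted), k=1)  # all pairs with i < j
    same = np.sign(predicted[i] - predicted[j]) == np.sign(observed[i] - observed[j])
    return float(np.mean(same))


# Hypothetical measurements for five sampled architectures (made-up numbers).
latency_ms = [5.0, 10.0, 20.0, 40.0, 80.0]
accuracy = [0.60, 0.68, 0.74, 0.79, 0.82]

a, b = fit_log_model(latency_ms, accuracy)
pred = predict_accuracy(a, b, latency_ms)
agreement = pairwise_rank_agreement(pred, accuracy)

# Estimate for an unseen latency budget of 30 ms.
unseen = float(predict_accuracy(a, b, 30.0))
```

Rank agreement is used here because the abstract emphasizes relative performance between models. A real evaluation would hold out some architectures rather than score the curve on the points it was fitted to.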

