CV | May 8, 2021

Optimising Resource Management for Embedded Machine Learning

arXiv:2105.03608v1 | 13 citations
AI Analysis

This paper addresses the challenge of optimizing resource management for embedded systems, enabling better latency, privacy, and connectivity in mobile and IoT applications. It appears incremental, however, as it builds on existing scalability techniques.

The paper tackles the problem of managing heterogeneous multi-core resources for embedded machine learning inference. It presents online approaches that dynamically scale Deep Neural Networks (DNNs) to trade off performance metrics such as speed, energy, accuracy, and confidence, with the aim of achieving consistent performance across different platforms.

Machine learning inference is increasingly being executed locally on mobile and embedded platforms, due to the clear advantages in latency, privacy and connectivity. In this paper, we present approaches for online resource management in heterogeneous multi-core systems and show how they can be applied to optimise the performance of machine learning workloads. Performance can be defined using platform-dependent (e.g. speed, energy) and platform-independent (accuracy, confidence) metrics. In particular, we show how a Deep Neural Network (DNN) can be dynamically scalable to trade off these various performance metrics. Achieving consistent performance when executing on different platforms is necessary yet challenging, due to the different resources provided and their capability, and their time-varying availability when executing alongside other workloads. Managing the interface between available hardware resources (often numerous and heterogeneous in nature), software requirements, and user experience is increasingly complex.
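The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' method. It assumes a resource manager that holds a table of pre-profiled operating points (one per DNN scaling level) and picks the most accurate one that fits the current latency and energy budgets. The names `OperatingPoint`, `select_operating_point`, and `POINTS`, along with every number in the table, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingPoint:
    """One hypothetical DNN scaling level, profiled on one hypothetical platform."""
    name: str
    latency_ms: float  # platform-dependent metric
    energy_mj: float   # platform-dependent metric
    accuracy: float    # platform-independent metric, in [0, 1]


def select_operating_point(
    points: Sequence[OperatingPoint],
    max_latency_ms: float,
    max_energy_mj: float,
) -> Optional[OperatingPoint]:
    """Return the most accurate point that meets both budgets, or None."""
    feasible = [
        p for p in points
        if p.latency_ms <= max_latency_ms and p.energy_mj <= max_energy_mj
    ]
    if not feasible:
        return None
    # Prefer higher accuracy; break ties by lower energy.
    return max(feasible, key=lambda p: (p.accuracy, -p.energy_mj))


# Made-up illustrative values, not measurements from the paper.
POINTS = [
    OperatingPoint("dnn-25%", latency_ms=10.0, energy_mj=20.0, accuracy=0.60),
    OperatingPoint("dnn-50%", latency_ms=20.0, energy_mj=45.0, accuracy=0.68),
    OperatingPoint("dnn-75%", latency_ms=35.0, energy_mj=80.0, accuracy=0.72),
    OperatingPoint("dnn-100%", latency_ms=50.0, energy_mj=120.0, accuracy=0.75),
]

if __name__ == "__main__":
    # Generous budgets: a larger configuration fits.
    print(select_operating_point(POINTS, max_latency_ms=40.0, max_energy_mj=100.0))
    # Tighter energy budget, e.g. when other workloads share the platform.
    print(select_operating_point(POINTS, max_latency_ms=40.0, max_energy_mj=50.0))
```

Calling the selector again whenever the budgets change (for example, when another workload starts) is one simple way to model the time-varying resource availability that the abstract mentions. A real manager would also need to choose cores and frequencies, which this sketch leaves out.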

Code Implementations: 1 repo