DC AI May 5, 2021

ScissionLite: Accelerating Distributed Deep Neural Networks Using Transfer Layer

Hyunho Ahn, Munkyu Lee, Cheol-Ho Hong, Blesson Varghese

arXiv:2105.02019v12.31 citations

Originality: Incremental advance

AI Analysis

This work addresses performance issues for IIoT applications using distributed DNNs, offering a domain-specific improvement that is incremental in nature.

The paper tackles low network performance, a bottleneck in distributed deep neural network inference for Industrial Internet of Things applications. It develops ScissionLite, a framework that uses a Transfer Layer to reduce outbound traffic. Inference latency improves by up to 16 times compared to local execution and by up to 2.8 times compared to an existing state-of-the-art approach.

Industrial Internet of Things (IIoT) applications can benefit from leveraging edge computing. For example, applications underpinned by deep neural network (DNN) models can be sliced and distributed across the IIoT device and the edge of the network to improve the overall performance of inference and to enhance the privacy of the input data, such as industrial product images. However, low network performance between IIoT devices and the edge is often a bottleneck. In this study, we develop ScissionLite, a holistic framework for accelerating distributed DNN inference using the Transfer Layer (TL). The TL is a traffic-aware layer inserted at the optimal slicing point of a DNN model in order to decrease the outbound network traffic without a significant accuracy drop. For the TL, we implement a new lightweight down/upsampling network for performance-limited IIoT devices. In ScissionLite, we develop ScissionTL, the Preprocessor, and the Offloader for the end-to-end activities of deploying DNN slices with the TL. They decide the optimal slicing point of the DNN, prepare pre-trained DNN slices including the TL, and execute the DNN slices on an IIoT device and the edge. Employing the TL for the sliced DNN models has a negligible overhead. ScissionLite improves the inference latency by up to 16 and 2.8 times when compared to execution on the local device and an existing state-of-the-art model slicing approach, respectively.
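The idea of choosing a slicing point while accounting for a traffic-reducing layer can be illustrated with a minimal sketch. This is not the ScissionLite implementation: the `Layer` record, the `best_slice` function, the `tl_factor` and `tl_overhead_ms` parameters, and all timings and tensor sizes below are hypothetical and chosen only for illustration. The sketch assumes end-to-end latency is the sum of device compute, TL overhead, network transfer, and edge compute, and that the TL shrinks the transferred tensor by a fixed factor.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    device_ms: float  # hypothetical time to run this layer on the IIoT device
    edge_ms: float    # hypothetical time to run this layer on the edge server
    out_bytes: int    # size of this layer's output tensor in bytes


def transfer_ms(num_bytes, bandwidth_mbps):
    # 1 Mbps is 1000 bits per millisecond.
    return num_bytes * 8 / (bandwidth_mbps * 1000)


def best_slice(layers, bandwidth_mbps, tl_factor=1.0, tl_overhead_ms=0.0):
    """Return (index, latency_ms) of the cheapest slicing point.

    layers[:index + 1] run on the device and the rest on the edge.
    tl_factor is the assumed reduction in transferred bytes from the TL
    (1.0 means no TL); tl_overhead_ms is its assumed compute cost.
    """
    best = None
    for i in range(len(layers)):
        device = sum(layer.device_ms for layer in layers[: i + 1])
        edge = sum(layer.edge_ms for layer in layers[i + 1:])
        net = transfer_ms(layers[i].out_bytes / tl_factor, bandwidth_mbps)
        total = device + tl_overhead_ms + net + edge
        if best is None or total < best[1]:
            best = (i, total)
    return best


# Hypothetical three-layer model on a 10 Mbps link.
layers = [
    Layer("conv1", device_ms=10, edge_ms=1, out_bytes=400_000),
    Layer("conv2", device_ms=20, edge_ms=2, out_bytes=100_000),
    Layer("fc", device_ms=30, edge_ms=3, out_bytes=4_000),
]

without_tl = best_slice(layers, bandwidth_mbps=10)
with_tl = best_slice(layers, bandwidth_mbps=10, tl_factor=4.0, tl_overhead_ms=1.0)
print(without_tl, with_tl)
```

With these made-up numbers, the cheapest slicing point moves from after the last layer to after `conv2` once the transferred tensor is shrunk, which is the kind of trade-off the abstract describes: less outbound traffic makes earlier offloading to the edge worthwhile. A real system would also have to account for the accuracy impact of the TL, which this sketch ignores.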

