NI, LG | Sep 3, 2019

Guardians of the Deep Fog: Failure-Resilient DNN Inference from Edge to Cloud

arXiv:1909.00995v2 | 29 citations
AI Analysis

This paper addresses the reliability of distributed DNN deployments in edge-to-cloud computing, but its contribution is incremental because it builds on existing concepts such as residual connections.

The paper tackles the problem of unpredictable performance in distributed DNN inference due to node failures by introducing deepFogGuard, an architecture augmentation scheme that uses skip hyperconnections to provide failure-resiliency, with extensive experiments confirming its effectiveness in edge-cloud networks.

Partitioning and distributing deep neural networks (DNNs) over physical nodes such as edge, fog, or cloud nodes could enhance sensor fusion and reduce bandwidth and inference latency. However, when a DNN is distributed over physical nodes, failure of the physical nodes causes the failure of the DNN units that are placed on these nodes. The performance of the inference task will be unpredictable and, most likely, poor if the distributed DNN is not specifically designed and properly trained for failures. Motivated by this, we introduce deepFogGuard, a DNN architecture augmentation scheme for making the distributed DNN inference task failure-resilient. To articulate deepFogGuard, we introduce the elements and a model for the resiliency of distributed DNN inference. Inspired by the concept of residual connections in DNNs, we introduce skip hyperconnections in distributed DNNs, which are the basis of deepFogGuard's design to provide resiliency. Our extensive experiments using two existing datasets for sensing and vision applications confirm the ability of deepFogGuard to provide resiliency for distributed DNNs in edge-cloud networks.
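
The idea of a skip hyperconnection can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the helper names (make_node, forward), the single dense layer per node, the positive uniform weights, and the choice to sum the outputs of the two preceding nodes are all assumptions made for illustration. Each physical node hosts one DNN unit, a failed node emits zeros, and a skip hyperconnection lets a node also receive the output of the node two hops upstream.

```python
import numpy as np

def make_node(rng, dim):
    """Toy stand-in for a DNN unit on one physical node: dense layer + ReLU.

    Weights are drawn positive (an illustrative assumption) so that a
    positive input always yields a positive output.
    """
    w = rng.uniform(0.1, 1.0, size=(dim, dim))
    return lambda x: np.maximum(x @ w, 0.0)

def forward(nodes, x, alive, use_skip=True):
    """Run a chain of nodes; alive[i] is False when physical node i has failed.

    A failed node emits zeros. With use_skip, each node receives the sum of
    the outputs of the previous node and the node before that (the skip
    hyperconnection), so a single failure does not cut the chain.
    """
    outputs = [x]  # outputs[0] is the sensor input, assumed always available
    for i, node in enumerate(nodes):
        inp = outputs[-1]
        if use_skip and len(outputs) >= 2:
            inp = inp + outputs[-2]
        out = node(inp) if alive[i] else np.zeros_like(x)
        outputs.append(out)
    return outputs[-1]

rng = np.random.default_rng(0)
nodes = [make_node(rng, 8) for _ in range(3)]  # e.g. edge -> fog -> cloud
x = np.ones(8)
alive = [True, False, True]  # the middle (fog) node has failed

without_skip = forward(nodes, x, alive, use_skip=False)
with_skip = forward(nodes, x, alive, use_skip=True)
```

In this sketch, the chain without skip hyperconnections outputs all zeros once the middle node fails, because the last node only sees the failed node's output. With skip hyperconnections, the last node still receives the first node's output and produces a non-trivial result. Whether that result is accurate depends on how the network was trained, which this sketch does not cover.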

