LGARSPOct 21, 2024

DeepVigor+: Scalable and Accurate Semi-Analytical Fault Resilience Analysis for Deep Neural Network

arXiv:2410.15742v2h-index: 20
Originality Incremental advance
AI Analysis

This work addresses the need for efficient hardware reliability assessment in safety-critical ML applications, offering a scalable solution for deep CNNs, though it appears incremental as an improvement over existing statistical methods.

The paper tackles the problem of time-consuming fault injection for reliability analysis in large convolutional neural networks by introducing DeepVigor+, a semi-analytical method that achieves less than 1% error in vulnerability factors while requiring 14.9 to 26.9 times fewer simulations than state-of-the-art statistical fault injection, enabling analysis in minutes instead of days or weeks.

The growing exploitation of Machine Learning (ML) in safety-critical applications necessitates rigorous safety analysis. Hardware reliability assessment is a major concern with respect to measuring the level of safety in ML-based systems. Quantifying the reliability of emerging ML models, including Convolutional Neural Networks (CNNs), is highly complex due to their enormous size in terms of the number of parameters and computations. Conventionally, Fault Injection (FI) is applied to perform a reliability measurement. However, performing FI on modern-day CNNs is prohibitively time-consuming if an acceptable confidence level is to be achieved. To speed up FI for large CNNs, statistical FI (SFI) has been proposed, but its runtimes are still considerably long. In this work, we introduce DeepVigor+, a scalable, fast, and accurate semi-analytical method as an efficient alternative for reliability measurement in CNNs. DeepVigor+ implements a fault propagation analysis model and attempts to acquire Vulnerability Factors (VFs) as reliability metrics in an optimal way. The results indicate that DeepVigor+ obtains VFs for CNN models with an error less than $1\%$, i.e., the objective in SFI, but with $14.9$ up to $26.9$ times fewer simulations than the best-known state-of-the-art SFI. DeepVigor+ enables an accurate reliability analysis for large and deep CNNs within a few minutes, rather than achieving the same results in days or weeks.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes