Cost-Effective Fault Tolerance for CNNs Using Parameter Vulnerability Based Hardening and Pruning
This paper addresses fault tolerance for CNNs in safety-critical domains with a hardware-agnostic solution, though the contribution is incremental because it builds on existing model-level techniques.
The paper tackles fault tolerance in CNNs for safety-critical applications with a model-level hardening approach that integrates error correction into the network, achieving fault resilience nearly equivalent to TMR at reduced overhead. It also proposes a pruning technique that makes the hardened CNNs up to 24% faster than their un-pruned counterparts with negligible accuracy loss.
Convolutional Neural Networks (CNNs) have become integral to safety-critical applications, raising concerns about their fault tolerance. Conventional hardware-dependent fault tolerance methods, such as Triple Modular Redundancy (TMR), are computationally expensive and impose considerable overhead on CNNs. Fault tolerance techniques can be applied at either the hardware level or the model level; the latter provides more flexibility without sacrificing generality. This paper introduces a model-level hardening approach for CNNs that integrates error correction directly into the neural network. The approach is hardware-agnostic and requires no changes to the underlying accelerator device. Analyzing the vulnerability of parameters makes it possible to duplicate only selected filters/neurons, whose output channels are then corrected by an efficient and robust correction layer. The proposed method demonstrates fault resilience nearly equivalent to TMR-based correction but with significantly reduced overhead. Nevertheless, the hardened networks still incur overhead relative to the baseline CNNs. To tackle this issue, a cost-effective parameter vulnerability based pruning technique is proposed that outperforms the conventional pruning method, yielding smaller networks with negligible accuracy loss. Notably, the hardened pruned CNNs perform up to 24% faster than the hardened un-pruned ones.
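To make the idea of selective duplication plus a correction layer concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It uses a tiny fully connected layer in place of a convolution. The vulnerability scores are treated as given inputs, since the abstract does not say how they are computed, and the correction rule (on a mismatch between a channel and its duplicate, keep the value with the smaller magnitude) is an assumption chosen for illustration. All function names are hypothetical.

```python
def select_vulnerable(scores, ratio):
    """Return sorted indices of the top `ratio` fraction of neurons by score.
    The scores are assumed inputs; how to compute them is out of scope here."""
    k = max(1, round(ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def harden(weights, idx):
    """Copy the layer and append a duplicate of each selected neuron (row)."""
    return [list(row) for row in weights] + [list(weights[i]) for i in idx]


def forward(weights, x):
    """Dense layer without bias or activation: one dot product per neuron."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def correct(outputs, idx, n_orig):
    """Merge each duplicated channel back into its original channel.
    Assumed rule: if the two copies differ, keep the smaller magnitude,
    on the premise that a weight fault tends to inflate the output."""
    fixed = list(outputs[:n_orig])
    for j, i in enumerate(idx):
        dup = outputs[n_orig + j]
        if abs(dup) < abs(fixed[i]):
            fixed[i] = dup
    return fixed


# Hypothetical layer with 4 neurons and 3 inputs.
weights = [[0.1, 0.2, 0.3], [0.3, 0.4, 0.5], [0.5, 0.1, 0.2], [0.2, 0.6, 0.1]]
scores = [0.1, 0.9, 0.3, 0.7]   # assumed vulnerability scores
x = [1.0, -2.0, 0.5]

idx = select_vulnerable(scores, ratio=0.5)   # duplicate the top half
hardened = harden(weights, idx)
clean = forward(weights, x)

hardened[1][0] = 1e6                         # simulate a fault in one weight
out = forward(hardened, x)
corrected = correct(out, idx, len(weights))
```

In this sketch only two of the four neurons are duplicated, so the layer grows by 50% rather than the 200% that triplicating everything would cost, which is the kind of trade-off the abstract describes. The smaller-magnitude rule would not recover from a fault that shrinks an output, and faults in non-duplicated neurons pass through uncorrected.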