Adaptive Soft Error Protection for Neural Network Processing
This work addresses the computational overhead of fault-tolerant neural network processing in high-reliability applications, and is an incremental improvement over prior static approaches.
The paper addresses soft-error mitigation in neural networks. It shows that vulnerability is input-dependent and proposes a lightweight GNN model for runtime vulnerability prediction, achieving a 42.12% average reduction in computational overhead compared to static methods while maintaining reliability.
Mitigating soft errors in neural networks (NNs) often incurs significant computational overhead. Traditional methods mainly explored static vulnerability variations across NN components, employing selective protection to minimize costs. In contrast, this work reveals that NN vulnerability is also input-dependent, exhibiting dynamic variations at runtime. To this end, we propose a lightweight graph neural network (GNN) model capable of capturing input- and component-specific vulnerability to soft errors. This model facilitates runtime vulnerability prediction, enabling an adaptive protection strategy that dynamically adjusts to varying vulnerabilities. The approach complements classical fault-tolerant techniques by tailoring protection efforts based on real-time vulnerability assessments. Experimental results across diverse datasets and NNs demonstrate that our adaptive protection method achieves a 42.12% average reduction in computational overhead compared to prior static vulnerability-based approaches, without compromising reliability.
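The difference between static and adaptive selective protection can be illustrated with a minimal sketch. It is not the paper's implementation: the layer names, per-layer protection costs, threshold, and per-input vulnerability scores below are all hypothetical, and the scores stand in for whatever a runtime predictor (such as the proposed GNN) would output. The toy numbers are unrelated to the reported 42.12% figure.

```python
from typing import Dict, List

# Hypothetical relative cost of protecting each layer (e.g. redundant execution).
LAYER_COST: Dict[str, float] = {"conv1": 1.0, "conv2": 4.0, "conv3": 4.0, "fc": 1.0}

# Static baseline (assumed): the same layers are protected for every input.
STATIC_PROTECTED: List[str] = ["conv1", "conv2", "conv3"]

# Hypothetical threshold above which a layer is protected for a given input.
THRESHOLD = 0.5


def select_layers(vulnerability: Dict[str, float], threshold: float) -> List[str]:
    """Return the layers whose predicted vulnerability is at or above the threshold."""
    return [name for name, score in vulnerability.items() if score >= threshold]


def protection_cost(layers: List[str], cost: Dict[str, float]) -> float:
    """Sum the protection cost of the given layers."""
    return sum(cost[name] for name in layers)


# Hypothetical per-input predictions; in practice these would come from a
# runtime vulnerability predictor evaluated on each input.
predictions: List[Dict[str, float]] = [
    {"conv1": 0.9, "conv2": 0.2, "conv3": 0.7, "fc": 0.1},
    {"conv1": 0.8, "conv2": 0.6, "conv3": 0.1, "fc": 0.1},
]

# Static protection pays the same cost for every input.
static_cost = protection_cost(STATIC_PROTECTED, LAYER_COST) * len(predictions)

# Adaptive protection pays only for layers predicted vulnerable on each input.
adaptive_cost = sum(
    protection_cost(select_layers(p, THRESHOLD), LAYER_COST) for p in predictions
)

# Fractional overhead reduction of adaptive over static protection.
reduction = 1.0 - adaptive_cost / static_cost

if __name__ == "__main__":
    print(f"static={static_cost}, adaptive={adaptive_cost}, reduction={reduction:.2%}")
```

The sketch omits the cost of running the predictor itself, which a real comparison would have to add to the adaptive side, and it does not model whether reliability is preserved.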