LG · Feb 13, 2025

Mitigating multiple single-event upsets during deep neural network inference using fault-aware training

Toon Vinck, Naïn Jonckers, Gert Dekkers, Jeffrey Prinzie, Peter Karsmakers

arXiv:2502.09374v17.11 citations · h-index: 4 · J Instrum

Originality: Incremental advance

AI Analysis

This work addresses the reliability of DNNs in harsh environments such as high-radiation settings, but it is incremental because it builds on existing fault mitigation techniques.

The study tackles multiple single-bit single-event upsets affecting deep neural networks in safety-critical applications. It proposes fault-aware training, which improves fault tolerance by up to a factor of 3 without hardware changes.

Deep neural networks (DNNs) are increasingly used in safety-critical applications. Reliable fault analysis and mitigation are essential to ensure their functionality in harsh environments with high radiation levels. This study analyses the impact of multiple single-bit single-event upsets in DNNs by performing fault injection at the level of the DNN model. Additionally, a fault-aware training (FAT) methodology is proposed that improves the DNNs' robustness to faults without any modification to the hardware. Experimental results show that the FAT methodology improves fault tolerance by up to a factor of 3.
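To make the idea of model-level fault injection concrete, here is a minimal sketch of flipping random single bits in a weight tensor. It assumes 32-bit floating-point weights stored in a NumPy array; the function name `inject_bit_flips` and its parameters are hypothetical, and the abstract does not state which number format or which parts of the model the authors actually target.

```python
import numpy as np

def inject_bit_flips(weights, n_flips, rng):
    """Return a copy of `weights` (as float32) with `n_flips` random single-bit flips.

    Illustrative assumption: faults hit uniformly random bits of uniformly
    random weights. The input array is left unchanged.
    """
    flat = np.ascontiguousarray(weights, dtype=np.float32).ravel().copy()
    bits = flat.view(np.uint32)  # reinterpret the same bytes as 32-bit integers
    idx = rng.integers(0, bits.size, size=n_flips)  # which weight is hit
    pos = rng.integers(0, 32, size=n_flips)         # which bit of that weight flips
    for i, p in zip(idx, pos):
        # XOR with a one-hot mask toggles exactly one bit
        bits[i] ^= np.uint32(1) << np.uint32(p)
    return flat.reshape(np.shape(weights))

rng = np.random.default_rng(0)
clean = np.ones((4, 4), dtype=np.float32)
faulty = inject_bit_flips(clean, n_flips=3, rng=rng)
```

Evaluating a model on weights perturbed this way, for an increasing number of flips, is one way to estimate how accuracy degrades under multiple upsets. Note that a flip in an exponent bit can change a weight by many orders of magnitude, while a flip in a low mantissa bit is nearly invisible, so results depend strongly on which bits are hit.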
