Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects
This work addresses the reliability of DNN-based systems, which is critical for safe deployment, although it is an incremental improvement over existing methods.
The paper tackles numerical defects in deep neural networks (DNNs) by proposing the RANUM approach for defect detection, feasibility confirmation, and fix suggestion. RANUM outperforms state-of-the-art methods on benchmarks of 63 real-world DNN architectures, and its generated fixes are equivalent to or better than human fixes in 37 out of 40 cases.
With the widespread deployment of deep neural networks (DNNs), ensuring the reliability of DNN-based systems is of great importance. Serious reliability issues such as system failures can be caused by numerical defects, one of the most frequent defects in DNNs. To assure high reliability against numerical defects, in this paper, we propose the RANUM approach including novel techniques for three reliability assurance tasks: detection of potential numerical defects, confirmation of potential-defect feasibility, and suggestion of defect fixes. To the best of our knowledge, RANUM is the first approach that confirms potential-defect feasibility with failure-exhibiting tests and suggests fixes automatically. Extensive experiments on the benchmarks of 63 real-world DNN architectures show that RANUM outperforms state-of-the-art approaches across the three reliability assurance tasks. In addition, when the RANUM-generated fixes are compared with developers' fixes on open-source projects, in 37 out of 40 cases, RANUM-generated fixes are equivalent to or even better than human fixes.
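To make the term "numerical defect" concrete, the following minimal sketch shows one common instance: a logarithm applied to an operand that can reach zero. It is an illustration only and does not reproduce RANUM's techniques. The function names (`naive_log_loss`, `clipped_log_loss`, `exhibits_failure`) and the clipping bound `eps` are hypothetical choices made for this example. The failing input `p = 0.0` plays the role of a failure-exhibiting test, and the clipped variant shows the kind of fix a developer might apply by hand.

```python
import math


def naive_log_loss(p: float) -> float:
    """Negative log-likelihood of a predicted probability p; undefined at p == 0."""
    return -math.log(p)


def clipped_log_loss(p: float, eps: float = 1e-12) -> float:
    """Same loss, with the operand clipped into [eps, 1.0] before the log.

    eps is an assumed bound chosen for illustration, not a recommended value.
    """
    return -math.log(min(max(p, eps), 1.0))


def exhibits_failure(loss_fn, p: float) -> bool:
    """Return True if evaluating loss_fn at p raises or yields a non-finite value."""
    try:
        return not math.isfinite(loss_fn(p))
    except (ValueError, OverflowError):
        return True


if __name__ == "__main__":
    for name, fn in [("naive", naive_log_loss), ("clipped", clipped_log_loss)]:
        print(name, "fails at p=0.0:", exhibits_failure(fn, 0.0))
```

For valid probabilities well inside (0, 1], both functions return the same value, so the clipping changes behavior only near the input that triggers the defect.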