Testing Deep Neural Networks
This work addresses the need for thorough testing of DNNs in safety-critical domains and offers incremental improvements over existing methods.
The paper tackles the problem of testing deep neural networks (DNNs) by proposing four novel, MC/DC-inspired test criteria tailored to the structure and semantics of DNNs. The criteria are validated through experiments on state-of-the-art DNNs trained on MNIST, CIFAR-10 and ImageNet, which show a balance between bug-finding ability (proxied by adversarial examples) and the computational cost of test case generation.
Deep neural networks (DNNs) have a wide range of applications, and software employing them must be thoroughly tested, especially in safety-critical domains. However, traditional software test coverage metrics cannot be applied directly to DNNs. In this paper, inspired by the MC/DC coverage criterion, we propose a family of four novel test criteria that are tailored to structural features of DNNs and their semantics. We validate the criteria by demonstrating that the generated test inputs guided via our proposed coverage criteria are able to capture undesired behaviours in a DNN. Test cases are generated using a symbolic approach and a gradient-based heuristic search. By comparing them with existing methods, we show that our criteria achieve a balance between their ability to find bugs (proxied using adversarial examples) and the computational cost of test case generation. Our experiments are conducted on state-of-the-art DNNs obtained using popular open source datasets, including MNIST, CIFAR-10 and ImageNet.
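The abstract does not define the four criteria, so the following is only a rough sketch of what an MC/DC-style coverage check on a DNN might look like, not the paper's definition. It assumes a small fully connected ReLU network and treats a neuron pair (i, j) in adjacent layers as covered by two inputs when neuron i is the only neuron in its layer whose activation sign flips and neuron j in the next layer flips as well. The function names, the network layout, and this particular covering condition are all assumptions made for illustration.

```python
import numpy as np

def layer_preactivations(weights, biases, x):
    """Return the pre-activation vector of every layer of a ReLU network."""
    pre = []
    a = x
    for W, b in zip(weights, biases):
        z = W @ a + b
        pre.append(z)
        a = np.maximum(z, 0.0)  # ReLU
    return pre

def sign_change_pairs(weights, biases, x1, x2, k):
    """Hypothetical MC/DC-style check between layers k and k+1.

    Returns the set of pairs (i, j) such that neuron i is the ONLY neuron
    in layer k whose activation sign differs between x1 and x2, and
    neuron j in layer k+1 also changes sign.
    """
    p1 = layer_preactivations(weights, biases, x1)
    p2 = layer_preactivations(weights, biases, x2)
    flip_k = (p1[k] > 0) != (p2[k] > 0)
    flip_next = (p1[k + 1] > 0) != (p2[k + 1] > 0)
    if flip_k.sum() != 1:
        # Zero or several flips in layer k: no single neuron can be
        # credited with independently affecting the next layer.
        return set()
    i = int(np.argmax(flip_k))
    return {(i, int(j)) for j in np.flatnonzero(flip_next)}

# Toy two-layer network (made-up weights, for illustration only).
demo_weights = [np.eye(2), np.array([[1.0, -1.0], [0.0, 1.0]])]
demo_biases = [np.zeros(2), np.array([-0.5, 0.5])]
covered = sign_change_pairs(
    demo_weights, demo_biases,
    np.array([1.0, 1.0]), np.array([1.0, -1.0]), k=0,
)
print(covered)
```

A coverage figure under this assumed condition would be the fraction of all adjacent-layer neuron pairs that some input pair in the test suite covers; a test generator would then search for input pairs that cover the pairs still missing.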