SEAI May 25, 2023

Rethinking Diversity in Deep Neural Network Testing

arXiv:2305.15698v2 (2 citations)
Originality: Incremental advance
AI Analysis

This work addresses the challenge of effectively testing DNNs for robustness, offering a more precise approach that could improve reliability in safety-critical applications, though it is incremental in refining existing testing methodologies.

The paper reframes deep neural network testing as directed testing rather than diversity-based testing: it prioritizes inputs likely to cause misclassifications and reports that its directed metrics consistently outperform diversity metrics in revealing errors.

Motivated by the success of traditional software testing, numerous diversity measures have been proposed for testing deep neural networks (DNNs). In this study, we propose a shift in perspective, advocating for the consideration of DNN testing as directed testing problems rather than diversity-based testing tasks. We note that the objective of testing DNNs is specific and well-defined: identifying inputs that lead to misclassifications. Consequently, a more precise testing approach is to prioritize inputs with a higher potential to induce misclassifications, as opposed to emphasizing inputs that enhance "diversity." We derive six directed metrics for DNN testing. Furthermore, we conduct a careful analysis of the appropriate scope for each metric, as applying metrics beyond their intended scope could significantly diminish their effectiveness. Our evaluation demonstrates that (1) diversity metrics are particularly weak indicators for identifying buggy inputs resulting from small input perturbations, and (2) our directed metrics consistently outperform diversity metrics in revealing erroneous behaviors of DNNs across all scenarios.
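As a rough illustration of what "prioritizing inputs with a higher potential to induce misclassifications" can look like in practice, the sketch below ranks inputs by prediction margin, the gap between the two highest class probabilities. This is an assumption made for illustration only: the abstract does not name the six directed metrics, and the margin score, the function names `margin_scores` and `prioritize`, and the toy probabilities are all hypothetical stand-ins, not the authors' method.

```python
import numpy as np

def margin_scores(probs):
    """Gap between the top-1 and top-2 class probabilities for each input.

    A small gap means the model is close to changing its prediction, which
    this sketch treats (as an assumption) as a higher misclassification risk.
    """
    probs = np.asarray(probs, dtype=float)
    top2 = np.sort(probs, axis=1)[:, -2:]  # two largest values per row, ascending
    return top2[:, 1] - top2[:, 0]

def prioritize(probs):
    """Return input indices ordered from smallest margin to largest."""
    return np.argsort(margin_scores(probs), kind="stable")

# Hypothetical softmax outputs for three inputs over three classes.
probs = np.array([
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # near the decision boundary
    [0.55, 0.44, 0.01],  # moderately uncertain
])
order = prioritize(probs)
print(order)
```

Under this scoring, the second input would be tested first because its top two classes are nearly tied. A diversity-based criterion would instead rank inputs by how much new coverage they add, regardless of how close the prediction is to flipping.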

