CV · Apr 3, 2023

D-Score: A White-Box Diagnosis Score for CNNs Based on Mutation Operators

arXiv:2304.00697v1 · 2 citations · h-index: 2
Originality: Incremental advance
AI Analysis

The paper addresses the need for more trustworthy CNN evaluation in safety-critical domains such as autonomous driving and medical diagnosis. The advance is incremental because it builds on prior mutation testing methods.

The paper tackles the unreliable evaluation of convolutional neural networks (CNNs) caused by low-quality test sets. It proposes D-Score, a white-box diagnostic approach that uses mutation operators and image transformations to assess a model's robustness and its fitness to a dataset. Experiments on two datasets and three CNNs are reported to show that the approach is effective.

Convolutional neural networks (CNNs) have been widely applied in many safety-critical domains, such as autonomous driving and medical diagnosis. However, concerns have been raised with respect to the trustworthiness of these models: The standard testing method evaluates the performance of a model on a test set, while low-quality and insufficient test sets can lead to unreliable evaluation results, which can have unforeseeable consequences. Therefore, how to comprehensively evaluate CNNs and, based on the evaluation results, how to enhance their trustworthiness are the key problems to be urgently addressed. Prior work has used mutation tests to evaluate the test sets of CNNs. However, the evaluation scores are black boxes and not explicit enough for what is being tested. In this paper, we propose a white-box diagnostic approach that uses mutation operators and image transformation to calculate the feature and attention distribution of the model and further present a diagnosis score, namely D-Score, to reflect the model's robustness and fitness to a dataset. We also propose a D-Score based data augmentation method to enhance the CNN's performance to translations and rescalings. Comprehensive experiments on two widely used datasets and three commonly adopted CNNs demonstrate the effectiveness of our approach.

