LGCVMLSep 5, 2019

DeepEvolution: A Search-Based Testing Approach for Deep Neural Networks

arXiv:1909.02563v143 citations
Originality Incremental advance
AI Analysis

This work addresses the need for more effective testing tools for deep learning models in safety-critical applications like autonomous vehicles, representing an incremental improvement over existing methods.

The paper tackles the problem of generating diverse test cases for deep neural networks in safety-critical systems by proposing DeepEvolution, a search-based testing approach that uses metaheuristics. The result is that DeepEvolution significantly increases neuronal coverage, finds corner-case behaviors, and outperforms Tensorfuzz in detecting latent defects during model quantization.

The increasing inclusion of Deep Learning (DL) models in safety-critical systems such as autonomous vehicles have led to the development of multiple model-based DL testing techniques. One common denominator of these testing techniques is the automated generation of test cases, e.g., new inputs transformed from the original training data with the aim to optimize some test adequacy criteria. So far, the effectiveness of these approaches has been hindered by their reliance on random fuzzing or transformations that do not always produce test cases with a good diversity. To overcome these limitations, we propose, DeepEvolution, a novel search-based approach for testing DL models that relies on metaheuristics to ensure a maximum diversity in generated test cases. We assess the effectiveness of DeepEvolution in testing computer-vision DL models and found that it significantly increases the neuronal coverage of generated test cases. Moreover, using DeepEvolution, we could successfully find several corner-case behaviors. Finally, DeepEvolution outperformed Tensorfuzz (a coverage-guided fuzzing tool developed at Google Brain) in detecting latent defects introduced during the quantization of the models. These results suggest that search-based approaches can help build effective testing tools for DL systems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes