SEAICRLGDec 13, 2018

DeepCruiser: Automated Guided Testing for Stateful Deep Learning Systems

arXiv:1812.05339v143 citations
Originality Incremental advance
AI Analysis

This addresses quality assurance challenges for developers of RNN-based systems in applications like audio and natural language processing, representing an incremental step in testing beyond feed-forward networks.

The paper tackles the problem of testing stateful deep learning systems, specifically recurrent neural networks (RNNs), by proposing DeepCruiser, an automated testing framework that generates tests guided by coverage criteria, and demonstrates its effectiveness in improving quality and reliability on a speech-to-text system.

Deep learning (DL) defines a data-driven programming paradigm that automatically composes the system decision logic from the training data. In company with the data explosion and hardware acceleration during the past decade, DL achieves tremendous success in many cutting-edge applications. However, even the state-of-the-art DL systems still suffer from quality and reliability issues. It was only until recently that some preliminary progress was made in testing feed-forward DL systems. In contrast to feed-forward DL systems, recurrent neural networks (RNN) follow a very different architectural design, implementing temporal behaviors and memory with loops and internal states. Such stateful nature of RNN contributes to its success in handling sequential inputs such as audio, natural languages and video processing, but also poses new challenges for quality assurance. In this paper, we initiate the very first step towards testing RNN-based stateful DL systems. We model RNN as an abstract state transition system, based on which we define a set of test coverage criteria specialized for stateful DL systems. Moreover, we propose an automated testing framework, DeepCruiser, which systematically generates tests in large scale to uncover defects of stateful DL systems with coverage guidance. Our in-depth evaluation on a state-of-the-art speech-to-text DL system demonstrates the effectiveness of our technique in improving quality and reliability of stateful DL systems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes