Data Synthesis for Testing Black-Box Machine Learning Models
This work addresses the reliability of black-box ML models, a concern for both users and developers, but it appears incremental because it builds on existing testing methods.
The paper tackles the problem of insufficient testing for black-box machine learning models by introducing a framework for automated test data synthesis. The framework generates realistic, controllable data for testing a variety of properties, and the authors demonstrate its effectiveness experimentally.
The increasing use of machine learning models raises the question of the reliability of these models. The current practice of testing with limited data is often insufficient. In this paper, we provide a framework for automated test data synthesis to test black-box ML/DL models. We address the important challenge of generating realistic, user-controllable data with model-agnostic coverage criteria to test a varied set of properties, with the aim of increasing trust in machine learning models. We experimentally demonstrate the effectiveness of our technique.