LG
Nov 25, 2019

Distortion and Faults in Machine Learning Software

arXiv:1911.11596v1
4 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of detecting hidden faults in machine learning software, a concern for both developers and researchers, but its contribution is incremental because it builds on existing validation methods.

The paper tackles the problem of quality assurance in deep neural network software by hypothesizing that faults in learning programs manifest as distortions in trained models, and demonstrates this with example cases on the MNIST dataset.

Machine learning software, deep neural network (DNN) software in particular, discerns valuable information from a large dataset. The outcomes of such DNN programs depend on the quality of both the learning programs and the datasets. Unfortunately, the quality of datasets is difficult to define, because they are just samples. Quality assurance of DNN software is difficult as well, because the resultant trained machine learning models are unknown prior to development, and validation is conducted only indirectly, in terms of prediction performance. This paper introduces the hypothesis that faults in the learning programs manifest themselves as distortions in the trained machine learning models. Relative distortion degrees, measured with appropriate observer functions, may indicate that there are hidden faults. The proposal is demonstrated with example cases on the MNIST dataset.
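The idea of comparing trained models through an observer function can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: it uses a tiny logistic regression on synthetic data rather than a DNN on MNIST, the seeded fault (a skipped bias update) is hypothetical, and the names `train`, `mean_abs_logit`, and `relative_distortion` are invented here. The observer is simply the mean absolute logit over a probe set, chosen as one possible statistic of a model's internal signal.

```python
import math
import random


def make_data(n, rng):
    """Synthetic 2-D binary data: label is 1 when x0 + x1 > 1."""
    data = []
    for _ in range(n):
        x = (rng.uniform(-1.0, 2.0), rng.uniform(-1.0, 2.0))
        data.append((x, 1.0 if x[0] + x[1] > 1.0 else 0.0))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(data, epochs=200, lr=0.5, faulty=False):
    """Batch gradient descent for logistic regression.

    With faulty=True the bias update is skipped: a hypothetical
    seeded bug standing in for a fault in the learning program.
    """
    w = [0.0, 0.0]
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        gw = [0.0, 0.0]
        gb = 0.0
        for x, y in data:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        if not faulty:
            b -= lr * gb / n
    return w, b


def mean_abs_logit(model, probe):
    """Observer function (an assumption): mean |logit| over probe inputs."""
    w, b = model
    total = sum(abs(w[0] * x[0] + w[1] * x[1] + b) for x, _ in probe)
    return total / len(probe)


def relative_distortion(observer, model, reference, probe):
    """Relative difference of the observer value against a reference model."""
    o_model = observer(model, probe)
    o_ref = observer(reference, probe)
    if o_ref == 0.0:
        raise ValueError("observer is zero on the reference model")
    return abs(o_model - o_ref) / abs(o_ref)


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(200, rng)
    reference = train(data)             # presumed-correct learning program
    suspect = train(data, faulty=True)  # learning program with the seeded bug
    print("relative distortion:",
          relative_distortion(mean_abs_logit, suspect, reference, data))
```

A noticeably nonzero value here would only suggest that the two learning programs produce differently shaped models; which observer functions are informative for real DNNs is the question the paper itself takes up.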

