LG · ML · Dec 20, 2019

Dependable Neural Networks for Safety Critical Tasks

arXiv:1912.09902v1 · 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of ensuring neural network reliability in safety-critical applications such as autonomous vehicles. The contribution is incremental: it builds on existing verification and adaptation methods.

The paper tackles the problem of verifying neural networks in safety-critical systems. It proposes metrics such as ML Dependability to predict performance under operating conditions that differ from the testing conditions, and demonstrates a Safety Function that, in one example, reduces harmful failures by a factor of 700 while maintaining a high probability of success.

Neural Networks are being integrated into safety critical systems, e.g., perception systems for autonomous vehicles, which require trained networks to perform safely in novel scenarios. It is challenging to verify neural networks because their decisions are not explainable, they cannot be exhaustively tested, and finite test samples cannot capture the variation across all operating conditions. Existing work seeks to train models robust to new scenarios via domain adaptation, style transfer, or few-shot learning. But these techniques fail to predict how a trained model will perform when the operating conditions differ from the testing conditions. We propose a metric, Machine Learning (ML) Dependability, that measures the network's probability of success in specified operating conditions which need not be the testing conditions. In addition, we propose the metrics Task Undependability and Harmful Undependability to distinguish network failures by their consequences. We evaluate the performance of a Neural Network agent trained using Reinforcement Learning in a simulated robot manipulation task. Our results demonstrate that we can accurately predict the ML Dependability, Task Undependability, and Harmful Undependability for operating conditions that are significantly different from the testing conditions. Finally, we design a Safety Function, using harmful failures identified during testing, that reduces harmful failures, in one example, by a factor of 700 while maintaining a high probability of success.
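The abstract does not spell out how the three metrics are computed, so the following is only a minimal sketch under stated assumptions. It assumes operating conditions can be discretized into named bins, that every test episode is labeled as a success, a task failure, or a harmful failure, and that the target operating conditions are given as a probability distribution over those bins. Under those assumptions, per-condition outcome rates measured in testing can be reweighted by the operating distribution using the law of total probability. All names here (`conditional_rates`, `reweight`, the condition labels, and the example data) are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Outcome labels are an assumption made for this illustration.
OUTCOMES = ("success", "task_failure", "harmful_failure")


def conditional_rates(test_log):
    """Estimate P(outcome | condition) from (condition, outcome) test records."""
    totals = Counter()
    counts = {}
    for condition, outcome in test_log:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        totals[condition] += 1
        counts.setdefault(condition, Counter())[outcome] += 1
    return {
        condition: {o: counts[condition][o] / totals[condition] for o in OUTCOMES}
        for condition in totals
    }


def reweight(rates, operating_dist):
    """Combine per-condition rates using the operating-condition distribution.

    Computes sum over conditions of P_operating(condition) * P(outcome | condition).
    """
    if abs(sum(operating_dist.values()) - 1.0) > 1e-9:
        raise ValueError("operating distribution must sum to 1")
    result = {o: 0.0 for o in OUTCOMES}
    for condition, prob in operating_dist.items():
        if condition not in rates:
            # No test data for this condition, so no estimate is possible.
            raise KeyError(f"condition {condition!r} was never tested")
        for o in OUTCOMES:
            result[o] += prob * rates[condition][o]
    return result


# Hypothetical test log: testing was split evenly between two conditions.
test_log = (
    [("day", "success")] * 8
    + [("day", "task_failure")] * 1
    + [("day", "harmful_failure")] * 1
    + [("night", "success")] * 5
    + [("night", "task_failure")] * 3
    + [("night", "harmful_failure")] * 2
)

# Hypothetical operating conditions that differ from testing: mostly night.
operating_dist = {"day": 0.25, "night": 0.75}

estimate = reweight(conditional_rates(test_log), operating_dist)
print("ML Dependability (sketch):", estimate["success"])
print("Task Undependability (sketch):", estimate["task_failure"])
print("Harmful Undependability (sketch):", estimate["harmful_failure"])
```

In this toy setup the three estimates sum to one because every episode receives exactly one label. The sketch refuses to estimate for a condition that never appeared in testing, which is the case where a reweighting approach of this kind has nothing to extrapolate from.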

