Deep Probabilistic Accelerated Evaluation: A Robust Certifiable Rare-Event Simulation Methodology for Black-Box Safety-Critical Systems
This work addresses the under-estimation problem in safety testing faced by engineers deploying learning-based systems, though it is incremental in that it builds on existing Importance Sampling methods.
The paper tackles the challenge of evaluating rare safety-critical events in black-box intelligent systems by proposing Deep Probabilistic Accelerated Evaluation (Deep-PrAE), a framework that converts versatile but unguaranteed samplers into ones with relaxed efficiency certificates. This enables accurate bounds on the event probability, and the paper demonstrates the framework's effectiveness in testing an intelligent driving algorithm.
Evaluating the reliability of intelligent physical systems against rare safety-critical events poses a huge testing burden for real-world applications. Simulation provides a useful platform to evaluate the extremal risks of these systems before deployment. Importance Sampling (IS), while proven to be powerful for rare-event simulation, faces challenges in handling these learning-based systems because their black-box nature fundamentally undermines its efficiency guarantee, which can lead to under-estimation that goes diagnostically undetected. We propose a framework called Deep Probabilistic Accelerated Evaluation (Deep-PrAE) to design statistically guaranteed IS by converting black-box samplers, which are versatile but could lack guarantees, into ones with what we call a relaxed efficiency certificate, which allows accurate estimation of bounds on the safety-critical event probability. We present the theory of Deep-PrAE, which combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples, including the safety testing of an intelligent driving algorithm.
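To make the dominating point idea concrete, the following is a minimal sketch of importance sampling for a toy rare event, not the Deep-PrAE procedure itself. It assumes a standard Gaussian input in two dimensions and a known rare-event set {x : x1 + x2 >= gamma}, whose single dominating point (the point of the set closest to the origin) is (gamma/2, gamma/2). In Deep-PrAE the set is unknown and is learned with a classifier; here it is given in closed form so the estimate can be checked against the exact value. The function names `is_estimate` and `exact_probability` and the choice gamma = 6 are illustrative assumptions.

```python
import math
import numpy as np


def is_estimate(gamma, n, seed=0):
    """Importance-sampling estimate of P(X1 + X2 >= gamma), X ~ N(0, I_2).

    The proposal is N(a, I_2), where a is the dominating point of the
    (assumed known) rare-event set, i.e. its minimum-norm point.
    """
    rng = np.random.default_rng(seed)
    a = np.array([gamma / 2.0, gamma / 2.0])  # dominating point of the toy set
    x = rng.standard_normal((n, 2)) + a       # samples from the shifted proposal
    hit = x.sum(axis=1) >= gamma              # indicator of the rare event
    # Likelihood ratio of N(0, I) to N(a, I): exp(-a.x + |a|^2 / 2)
    lr = np.exp(-x @ a + 0.5 * (a @ a))
    w = hit * lr
    return w.mean(), w.std(ddof=1) / math.sqrt(n)


def exact_probability(gamma):
    """Closed form: X1 + X2 ~ N(0, 2), so P = 0.5 * erfc(gamma / 2)."""
    return 0.5 * math.erfc(gamma / 2.0)


if __name__ == "__main__":
    est, se = is_estimate(gamma=6.0, n=100_000)
    print(f"IS estimate: {est:.3e} (std. error {se:.1e})")
    print(f"exact value: {exact_probability(6.0):.3e}")
```

With a probability on the order of 1e-5, naive Monte Carlo with the same sample size would see only about one hit on average, whereas the shifted proposal places roughly half of its samples in the rare-event set. The under-estimation risk described above arises when the set has further dominating points that the sampler misses, which this single-point toy case does not exhibit.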