AI · Apr 4, 2023

PAC-Based Formal Verification for Out-of-Distribution Data Detection

arXiv:2304.01592v1 · 1 citation · h-index: 28 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses safety-critical needs in domains such as autonomous vehicles by providing formal verification for OOD detection, though it appears incremental because it builds on existing VAE-based methods.

This study tackles the problem of guaranteeing performance for out-of-distribution (OOD) detection in cyber-physical systems by using variational autoencoders and conformal constraints to bound detection error with user-defined confidence, verified on data from the CARLA driving simulator.

Cyber-physical systems (CPS) that use learning components, such as autonomous vehicles, are often sensitive to noise and out-of-distribution (OOD) instances encountered at runtime. Safety-critical tasks therefore depend on OOD detection subsystems to restore the CPS to a known state or to interrupt execution before safety is compromised. However, guaranteeing the performance of OOD detectors is difficult because the OOD aspect of an instance is hard to characterize, especially in high-dimensional unstructured data. To distinguish between OOD data and data known to the learning component through the training process, an emerging technique is to incorporate variational autoencoders (VAEs) within systems and apply classification or anomaly detection techniques on their latent spaces. The rationale is that encoding reduces the size of the data domain, which benefits real-time systems through decreased processing requirements, facilitates feature analysis for unstructured data, and allows more explainable techniques to be implemented. This study places probably approximately correct (PAC) based guarantees on OOD detection by using the encoding process within VAEs to quantify image features and applying conformal constraints over them. This bounds the detection error on unfamiliar instances with user-defined confidence. The bounds are established empirically by sampling the latent probability distribution and evaluating the error with respect to the constraint violations encountered. The guarantee is then verified using data generated from CARLA, an open-source driving simulator.
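
The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a diagonal Gaussian latent posterior, uses axis-aligned interval constraints as a simple stand-in for conformal constraints, and uses Hoeffding's inequality to turn a user-chosen tolerance `epsilon` and confidence `1 - delta` into a sample count. All function names (`fit_interval_constraints`, `violates`, `sample_latent`, `hoeffding_sample_size`, `pac_violation_bound`) are hypothetical.

```python
import math
import random


def fit_interval_constraints(latents, slack=0.0):
    """Learn per-dimension [low, high] bounds from in-distribution latent vectors.

    Assumption: interval constraints stand in for the paper's conformal constraints.
    """
    dims = len(latents[0])
    lows = [min(z[d] for z in latents) - slack for d in range(dims)]
    highs = [max(z[d] for z in latents) + slack for d in range(dims)]
    return lows, highs


def violates(z, constraints):
    """Return True if any latent feature falls outside its learned interval."""
    lows, highs = constraints
    return any(v < lo or v > hi for v, lo, hi in zip(z, lows, highs))


def sample_latent(mu, sigma, rng):
    """Draw one sample from an assumed diagonal Gaussian posterior N(mu, sigma^2)."""
    return [rng.gauss(m, s) for m, s in zip(mu, sigma)]


def hoeffding_sample_size(epsilon, delta):
    """Samples needed so that |empirical rate - true rate| <= epsilon
    holds with probability at least 1 - delta (two-sided Hoeffding bound)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def pac_violation_bound(mu, sigma, constraints, epsilon, delta, rng):
    """Estimate the constraint-violation rate by sampling the latent distribution.

    Returns (empirical rate, upper bound on the true rate, samples used).
    The upper bound holds with probability at least 1 - delta.
    """
    n = hoeffding_sample_size(epsilon, delta)
    hits = sum(violates(sample_latent(mu, sigma, rng), constraints) for _ in range(n))
    empirical = hits / n
    return empirical, min(1.0, empirical + epsilon), n


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "training" latents; a real system would use VAE encoder outputs.
    train = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
    constraints = fit_interval_constraints(train)
    rate, upper, n = pac_violation_bound(
        [0.0, 0.0], [0.1, 0.1], constraints, epsilon=0.05, delta=0.01, rng=rng
    )
    print(f"samples={n} empirical={rate:.3f} upper_bound={upper:.3f}")
```

With `epsilon=0.05` and `delta=0.01`, the Hoeffding formula asks for 1060 samples. A tighter tolerance or higher confidence raises that count, which is the trade-off a real-time system would have to budget for.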
