Vision-Based Runtime Monitoring under Varying Specifications using Semantic Latent Representations

arXiv:2605.13923
Predicted impact: top 65% in LG (last 90 days) · Originality: incremental advance
AI Analysis

For safety-critical autonomous systems that must be monitored at runtime from visual inputs, this work provides a reusable monitoring framework with finite-sample guarantees, though the approach is limited to fragments of past-time signal temporal logic (ptSTL).

This work introduces reusable runtime monitors for vision-based ptSTL specifications that provide finite-sample guarantees without per-formula retraining. The semantic-basis monitor achieves up to 4× tighter certified bounds at long horizons compared to a rolling prediction monitor on a pedestrian-crossroad benchmark, and both satisfy conformal coverage guarantees on Waymo driving data.

We study certified runtime monitoring of past-time signal temporal logic (ptSTL) from visual observations under partial observability. The monitor must infer safety-relevant quantities from images and provide finite-sample guarantees, while being reusable: once trained and calibrated, it should certify any formula in a target fragment without per-formula retraining. For fragments induced by a finite dictionary of temporal atoms, we prove that the semantic basis, the vector of atom robustness scores, is the minimum prediction target within the class of monotone, 1-Lipschitz reusable interfaces: any formula is evaluated by a deterministic decoder derived from the parse tree, and a single conformal calibration pass certifies the entire fragment with no union bound. We also introduce a rolling prediction monitor that predicts only current predicate values and reconstructs temporal history online; this is easier to learn but grows conservative at long horizons. On a pedestrian-crossroad benchmark, the rolling monitor achieves tighter certified bounds at short horizons, while the semantic-basis monitor is up to 4× tighter at long horizons. We validate both monitors on real-world Waymo driving data, where both satisfy the conformal coverage guarantee empirically.
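
The idea of decoding any formula from a single calibrated prediction of the semantic basis can be illustrated with a minimal sketch. This is not the paper's implementation: the tuple encoding of formulas, the function names, the restriction to negation-free formulas decoded with min and max, the sup-norm nonconformity score, and the calibration numbers are all assumptions made for illustration. The sketch relies on the fact that a min/max decoder is monotone and 1-Lipschitz in the sup norm, so one error radius on the basis bounds the robustness of every formula at once.

```python
import math


def decode(formula, basis):
    """Evaluate a negation-free formula on a vector of atom robustness scores.

    Hypothetical encoding: ("atom", i), ("and", f, g, ...), ("or", f, g, ...).
    """
    op = formula[0]
    if op == "atom":
        return basis[formula[1]]
    values = [decode(sub, basis) for sub in formula[1:]]
    if op == "and":
        return min(values)  # conjunction: worst sub-formula
    if op == "or":
        return max(values)  # disjunction: best sub-formula
    raise ValueError(f"unknown operator: {op}")


def conformal_radius(pred_cal, true_cal, alpha):
    """Split-conformal quantile of the sup-norm error over the whole basis.

    Assumed score: max_j |predicted atom j - true atom j| per calibration sample.
    """
    scores = sorted(
        max(abs(p - t) for p, t in zip(pred, true))
        for pred, true in zip(pred_cal, true_cal)
    )
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration samples for this alpha
    return scores[k - 1]


def certified_lower_bound(formula, pred_basis, radius):
    """Lower bound on the true robustness of `formula`.

    Valid whenever the sup-norm basis error is at most `radius`, because
    min/max decoders are 1-Lipschitz in the sup norm.
    """
    return decode(formula, pred_basis) - radius


# Made-up calibration data for a two-atom dictionary (illustration only).
pred_cal = [[0.5, -0.25], [1.0, 0.25], [0.0, 0.5], [0.0, 0.0]]
true_cal = [[0.25, -0.25], [1.5, 0.25], [0.0, -0.25], [0.125, 0.0]]
radius = conformal_radius(pred_cal, true_cal, alpha=0.25)

# One radius serves every formula in the fragment; no per-formula calibration.
formulas = {
    "a0 and a1": ("and", ("atom", 0), ("atom", 1)),
    "a0 or a1": ("or", ("atom", 0), ("atom", 1)),
}
for name, phi in formulas.items():
    print(name, certified_lower_bound(phi, [1.0, 0.5], radius))
```

Because the event "the basis error is within the radius" is a single event, every formula's bound holds simultaneously on it, which is why this sketch needs no union bound over formulas.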
