Yuang Geng

LG
h-index5
6papers
26citations
Novelty57%
AI Score46

6 Papers

LGNov 8, 2023
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers

Yuang Geng, Jake Brandon Baldauf, Souradeep Dutta et al.

Autonomous systems are increasingly implemented using end-to-end learning-based controllers. Such controllers make decisions that are executed on the real system, with images as one of the primary sensing modalities. Deep neural networks form a fundamental building block of such controllers. Unfortunately, the existing neural-network verification tools do not scale to inputs with thousands of dimensions -- especially when the individual inputs (such as pixels) are devoid of clear physical meaning. This paper takes a step towards connecting exhaustive closed-loop verification with high-dimensional controllers. Our key insight is that the behavior of a high-dimensional controller can be approximated with several low-dimensional controllers. To balance the approximation accuracy and verifiability of our low-dimensional controllers, we leverage the latest verification-aware knowledge distillation. Then, we inflate low-dimensional reachability results with statistical approximation errors, yielding a high-confidence reachability guarantee for the high-dimensional controller. We investigate two inflation techniques -- based on trajectories and control actions -- both of which show convincing performance in three OpenAI gym benchmarks.

8.7CVApr 1
Deterministic World Models for Verification of Closed-loop Vision-based Systems

Yuang Geng, Zhuoyang Zhou, Zhongzheng Zhang et al.

Verifying closed-loop vision-based control systems remains a fundamental challenge due to the high dimensionality of images and the difficulty of modeling visual environments. While generative models are increasingly used as camera surrogates in verification, their reliance on stochastic latent variables introduces unnecessary overapproximation error. To address this bottleneck, we propose a Deterministic World Model (DWM) that maps system states directly to generative images, effectively eliminating uninterpretable latent variables to ensure precise input bounds. The DWM is trained with a dual-objective loss function that combines pixel-level reconstruction accuracy with a control difference loss to maintain behavioral consistency with the real system. We integrate DWM into a verification pipeline utilizing Star-based reachability analysis (StarV) and employ conformal prediction to derive rigorous statistical bounds on the trajectory deviation between the world model and the actual vision-based system. Experiments on standard benchmarks show that our approach yields significantly tighter reachable sets and better verification performance than a latent-variable baseline.

6.1CVMar 25
MLE-UVAD: Minimal Latent Entropy Autoencoder for Fully Unsupervised Video Anomaly Detection

Yuang Geng, Junkai Zhou, Kang Yang et al.

In this paper, we address the challenging problem of single-scene, fully unsupervised video anomaly detection (VAD), where raw videos containing both normal and abnormal events are used directly for training and testing without any labels. This differs sharply from prior work that either requires extensive labeling (fully or weakly supervised) or depends on normal-only videos (one-class classification), which are vulnerable to distribution shifts and contamination. We propose an entropy-guided autoencoder that detects anomalies through reconstruction error by reconstructing normal frames well while making anomalies reconstruct poorly. The key idea is to combine the standard reconstruction loss with a novel Minimal Latent Entropy (MLE) loss in the autoencoder. Reconstruction loss alone maps normal and abnormal inputs to distinct latent clusters due to their inherent differences, but also risks reconstructing anomalies too well to detect. Therefore, MLE loss addresses this by minimizing the entropy of latent embeddings, encouraging them to concentrate around high-density regions. Since normal frames dominate the raw video, sparse anomalous embeddings are pulled into the normal cluster, so the decoder emphasizes normal patterns and produces poor reconstructions for anomalies. This dual-loss design produces a clear reconstruction gap that enables effective anomaly detection. Extensive experiments on two widely used benchmarks and a challenging self-collected driving dataset demonstrate that our method achieves robust and superior performance over baselines.

8.7SYApr 23
Statistical-Symbolic Verification of Perception-Based Autonomous Systems using State-Dependent Conformal Prediction

Yuang Geng, Thomas Waite, Trevor Turnquist et al.

Reachability analysis has been a prominent way to provide safety guarantees for neurally controlled autonomous systems, but its direct application to neural perception components is infeasible due to imperfect or intractable perception models. Typically, this issue has been bypassed by complementing reachability with statistical analysis of perception error, say with conformal prediction (CP). However, existing CP methods for time-series data often provide conservative bounds. The corresponding error accumulation over time has made it challenging to combine statistical bounds with symbolic reachability in a way that is provable, scalable, and minimally conservative. To reduce conservatism and improve scalability, our key insight is that perception error varies significantly with the system's dynamical state. This article proposes state-dependent conformal prediction, which exploits that dependency in constructing tight high-confidence bounds on perception error. Based on this idea, we provide an approach to partition the state space, using a genetic algorithm, so as to optimize the tightness of conformal bounds. Finally, since using these bounds in reachability analysis leads to additional uncertainty and branching in the resulting hybrid system, we propose a branch-merging reachability algorithm that trades off uncertainty for scalability so as to enable scalable and tight verification. The evaluation of our verification methodology on two complementary case studies demonstrates reduced conservatism compared to the state of the art.

LGMar 4, 2025
Four Principles for Physically Interpretable World Models

Jordan Peper, Zhenjiang Mao, Yuang Geng et al.

As autonomous systems are increasingly deployed in open and uncertain settings, there is a growing need for trustworthy world models that can reliably predict future high-dimensional observations. The learned latent representations in world models lack direct mapping to meaningful physical quantities and dynamics, limiting their utility and interpretability in downstream planning, control, and safety verification. In this paper, we argue for a fundamental shift from physically informed to physically interpretable world models - and crystallize four principles that leverage symbolic knowledge to achieve these ends: (1) functionally organizing the latent space according to the physical intent, (2) learning aligned invariant and equivariant representations of the physical world, (3) integrating multiple forms and strengths of supervision into a unified training process, and (4) partitioning generative outputs to support scalability and verifiability. We experimentally demonstrate the value of each principle on two benchmarks. This paper opens several intriguing research directions to achieve and capitalize on full physical interpretability in world models.

LGMar 30, 2024
Zero-shot Safety Prediction for Autonomous Robots with Foundation World Models

Zhenjiang Mao, Siqi Dai, Yuang Geng et al.

A world model creates a surrogate world to train a controller and predict safety violations by learning the internal dynamic model of systems. However, the existing world models rely solely on statistical learning of how observations change in response to actions, lacking precise quantification of how accurate the surrogate dynamics are, which poses a significant challenge in safety-critical systems. To address this challenge, we propose foundation world models that embed observations into meaningful and causally latent representations. This enables the surrogate dynamics to directly predict causal future states by leveraging a training-free large language model. In two common benchmarks, this novel model outperforms standard world models in the safety prediction task and has a performance comparable to supervised learning despite not using any data. We evaluate its performance with a more specialized and system-relevant metric by comparing estimated states instead of aggregating observation-wide error.