Bingzhuo Zhong

h-index6

11papers

10citations

Novelty52%

AI Score53

Ranked #12,934 of 194,257 authors (top 7%)#49 in SY (top 3%)

11 Papers

6.6SYApr 19

Controlled Invariant Sets for Gaussian Process State Space Models

Paul Griffioen, Bingzhuo Zhong, Murat Arcak et al.

We compute probabilistic controlled invariant sets for nonlinear systems using Gaussian process state space models, which are data-driven models that account for unmodeled and unknown nonlinear dynamics. We propose a semidefinite programming scheme for designing state-feedback controllers that maximize the probability of the trajectories staying within a probabilistic controlled invariant set while satisfying input constraints. The results are validated on a quadrotor, both in simulation and on a physical platform.

9.1SYMay 11Code

Lure-and-Reveal: An Exposure Framework for Stealthy Deception Attack in Multi-sensor Uncertain Systems

Meiqi Tian, Yihan Liu, Bingzhuo Zhong

Multi-sensor integration via error-state Kalman filter (KF) is widely employed for precise state estimation in cyber-physical systems (CPSs). However, this integration exposes the system to stealthy deception attacks that render conventional detection mechanisms ineffective. We propose an exposure framework to actively reveal such stealthy attacks without modifying sensor interfaces. The framework introduces a suspect mode in which the defender injects random exposure shakes into the nominal control inputs, thus creating a discrepancy between the defender's true state estimates and the attacker's manipulated state estimates, preventing the attack from remaining stealthy. We further derive an explicit exposure condition that characterizes the minimum shake magnitude to guarantee the finite-time exposure and a compensable condition that ensures the shakes do not degrade closed-loop performance. Simulation results based on a GNSS/INS-integrated UAV system verify the effectiveness of the proposed framework.

6.2ROMar 24Code

PHANTOM Hand

Teng Yan, Jiongxu Chen, Qixiang Hua et al.

Tendon-driven underactuated hands excel in adaptive grasping but often suffer from kinematic unpredictability and highly non-linear force transmission. This ambiguity limits their ability to perform precise free-motion shaping and deliver reliable payloads for complex manipulation tasks. To address this, we introduce the PHANTOM Hand (Hybrid Precision-Augmented Compliance): a modular, 1:1 human-scale system featuring 6 actuators and 15 degrees of freedom (DoFs). We propose a unified framework that bridges the gap between precise analytic shaping and robust compliant grasping. By deriving a sparse mapping from physical geometry and integrating a mechanics-based compensation model, we effectively suppress kinematic drift caused by spring counter-tension and tendon elasticity. This approach achieves sub-degree kinematic reproducibility for free-motion planning while retaining the inherent mechanical compliance required for stable physical interaction. Experimental validation confirms the system's capabilities through (1) kinematic analysis verifying sub-degree global accuracy across the workspace; (2) static expressibility tests demonstrating complex hand gestures; (3) diverse grasping experiments covering power, precision, and tool-use categories; and (4) quantitative fingertip force characterization. The results demonstrate that the PHANTOM hand successfully combines analytic kinematic precision with continuous, predictable force output, significantly expanding the payload and dexterity of underactuated hands. To drive the development of the underactuated manipulation ecosystem, all hardware designs and control scripts are fully open-sourced for community engagement.

7.1LOApr 10

Automatic Generation of Safety-compliant Linear Temporal Logic via Large Language Model: A Self-supervised Framework

Junle Li, Siqi Chen, Jiakai Li et al.

Converting high-level tasks described by natural language into formal specifications like Linear Temporal Logic (LTL) is a key step towards providing formal safety guarantees over cyber-physical systems (CPS). While the compliance of the formal specifications themselves against the safety restrictions imposed on CPS is crucial for ensuring safety, most existing works only focus on translation consistency between natural languages and formal specifications. In this paper, we introduce AutoSafeLTL, a self-supervised framework that utilizes large language models (LLMs) to automate the generation of LTL specifications complying with a set of safety restrictions while preserving their logical consistency and semantic accuracy. As a key insight, our framework integrates Language Inclusion check with an automated counterexample-guided modification mechanism to ensure the safety-compliance of the resulting LTL specifications. In particular, we develop 1) an LLM-as-an-Aligner, which performs atomic proposition matching between generated LTL specifications and safety restrictions to enforce semantic alignment; and 2) an LLM-as-a-Critic, which automates LTL specification refinement by interpreting counterexamples derived from Language Inclusion checks. Experimental results demonstrate that our architecture effectively guarantees safety-compliance for the generated LTL specifications, achieving a 0% violation rate against imposed safety restrictions. This shows the potential of our work in synergizing AI and formal verification techniques, enhancing safety-aware specification generation and automatic verification for both AI and critical CPS applications.

7.1ROMar 24

Agile-VLA: Few-Shot Industrial Pose Rectification via Implicit Affordance Anchoring

Teng Yan, Zhengyang Pei, Chengyu Shi et al.

Deploying Vision-Language-Action (VLA) models on resource-constrained edge platforms encounters a fundamental conflict between high-latency semantic inference and the high-frequency control required for dynamic manipulation. To address the challenge, this paper presents Agile-VLA, a hierarchical framework designed for industrial pose reorientation tasks on edge devices such as the NVIDIA Jetson Orin Nano. The core innovation is an Implicit Affordance Anchoring mechanism that directly maps geometric visual cues, specifically centroid and rim keypoint anchors, into structured parametric action primitives, thereby substantially reducing reliance on high-latency semantic inference during closed-loop control. By decoupling perception (10 Hz) from control (50 Hz) via an asynchronous dual-stream architecture, the system effectively mitigates the frequency mismatch inherent in edge-based robot learning. Experimental results on a standard 6-DoF manipulator demonstrate that Agile-VLA achieves robust rectification of complex, irregular workpieces using only 5-shot demonstrations through extrinsic dexterity.

5.7ROApr 2

Bridging Discrete Planning and Continuous Execution for Redundant Robot

Teng Yan, Yue Yu, Yihan Liu et al.

Voxel-grid reinforcement learning is widely adopted for path planning in redundant manipulators due to its simplicity and reproducibility. However, direct execution through point-wise numerical inverse kinematics on 7-DoF arms often yields step-size jitter, abrupt joint transitions, and instability near singular configurations. This work proposes a bridging framework between discrete planning and continuous execution without modifying the discrete planner itself. On the planning side, step-normalized 26-neighbor Cartesian actions and a geometric tie-breaking mechanism are introduced to suppress unnecessary turns and eliminate step-size oscillations. On the execution side, a task-priority damped least-squares (TP-DLS) inverse kinematics layer is implemented. This layer treats end-effector position as a primary task, while posture and joint centering are handled as subordinate tasks projected into the null space, combined with trust-region clipping and joint velocity constraints. On a 7-DoF manipulator in random sparse, medium, and dense environments, this bridge raises planning success in dense scenes from about 0.58 to 1.00, shortens representative path length from roughly 1.53 m to 1.10 m, and while keeping end-effector error below 1 mm, reduces peak joint accelerations by over an order of magnitude, substantially improving the continuous execution quality of voxel-based RL paths on redundant manipulators.

6.7SYMar 31

Communication-Aware Synthesis of Safety Controller for Networked Control Systems

Yihan Liu, Meiqi Tian, teng Yan et al.

Networked control systems (NCS) are widely used in safety-critical applications, but they are often analyzed under the assumption of ideal communication channels. This work focuses on the synthesis of safety controllers for discrete-time linear systems affected by unknown disturbances operating in imperfect communication channels. The proposed method guarantees safety by constructing ellipsoidal robust safety invariant (RSI) sets and verifying their invariance through linear matrix inequalities (LMI), which are formulated and solved as semi-definite programming (SDP). In particular, our framework simultaneously considers controller synthesis and communication errors without requiring explicit modeling of the communication channel. A case study on cruise control problem demonstrates that the proposed controller ensures safety in the presence of unexpected disturbances and multiple communication imperfections simultaneously.

5.8CVMar 24

PiCo: Active Manifold Canonicalization for Robust Robotic Visual Anomaly Detection

Teng Yan, Binkai Liu, Shuai Liu et al.

Industrial deployment of robotic visual anomaly detection (VAD) is fundamentally constrained by passive perception under diverse 6-DoF pose configurations and unstable operating conditions such as illumination changes and shadows, where intrinsic semantic anomalies and physical disturbances coexist and interact. To overcome these limitations, a paradigm shift from passive feature learning to Active Canonicalization is proposed. PiCo (Pose-in-Condition Canonicalization) is introduced as a unified framework that actively projects observations onto a condition-invariant canonical manifold. PiCo operates through a cascaded mechanism. The first stage, Active Physical Canonicalization, enables a robotic agent to reorient objects in order to reduce geometric uncertainty at its source. The second stage, Neural Latent Canonicalization, adopts a three-stage denoising hierarchy consisting of photometric processing at the input level, latent refinement at the feature level, and contextual reasoning at the semantic level, progressively eliminating nuisance factors across representational scales. Extensive evaluations on the large-scale M2AD benchmark demonstrate the superiority of this paradigm. PiCo achieves a state-of-the-art 93.7% O-AUROC, representing a 3.7% improvement over prior methods in static settings, and attains 98.5% accuracy in active closed-loop scenarios. These results demonstrate that active manifold canonicalization is critical for robust embodied perception.

7.4SYApr 30

Hierarchical Control for Continuous-time Systems via General Approximate Alternating Simulation Relations

Zhiyuan Huang, Shuo Li, Murat Arcak et al.

This paper introduces a general approximate alternating simulation relation (\emph{$\varepsilon$-gAAS relation}) for continuous-time systems, which relaxes existing simulation relations to tolerate larger mismatches between abstract and concrete models. The definition of gAAS for continuous-time systems is first proposed, and its properties are investigated. Then, a control refinement method is developed to enable hierarchical control for the gAAS relation. Finally, case studies demonstrate the effectiveness of the proposed approach, highlighting its advantages over existing methods.

3.3SYDec 2, 2024

Transfer Learning for Control Systems via Neural Simulation Relations

Alireza Nadali, Bingzhuo Zhong, Ashutosh Trivedi et al.

Transfer learning is an umbrella term for machine learning approaches that leverage knowledge gained from solving one problem (the source domain) to improve speed, efficiency, and data requirements in solving a different but related problem (the target domain). The performance of the transferred model in the target domain is typically measured via some notion of loss function in the target domain. This paper focuses on effectively transferring control logic from a source control system to a target control system while providing approximately similar behavioral guarantees in both domains. However, in the absence of a complete characterization of behavioral specifications, this problem cannot be captured in terms of loss functions. To overcome this challenge, we use (approximate) simulation relations to characterize observational equivalence between the behaviors of two systems. Simulation relations ensure that the outputs of both systems, equipped with their corresponding controllers, remain close to each other over time, and their closeness can be quantified {\it a priori}. By parameterizing simulation relations with neural networks, we introduce the notion of \emph{neural simulation relations}, which provides a data-driven approach to transfer any synthesized controller, regardless of the specification of interest, along with its proof of correctness. Compared with prior approaches, our method eliminates the need for a closed-loop mathematical model and specific requirements for both the source and target systems. We also introduce validity conditions that, when satisfied, guarantee the closeness of the outputs of two systems equipped with their corresponding controllers, thus eliminating the need for post-facto verification. We demonstrate the effectiveness of our approach through case studies involving a vehicle and a double inverted pendulum.

9.4CVMar 8

AR2-4FV: Anchored Referring and Re-identification for Long-Term Grounding in Fixed-View Videos

Teng Yan, Yihan Liu, Jiongxu Chen et al.

Long-term language-guided referring in fixed-view videos is challenging: the referent may be occluded or leave the scene for long intervals and later re-enter, while framewise referring pipelines drift as re-identification (ReID) becomes unreliable. AR2-4FV leverages background stability for long-term referring. An offline Anchor Bank is distilled from static background structures; at inference, the text query is aligned with this bank to produce an Anchor Map that serves as persistent semantic memory when the referent is absent. An anchor-based re-entry prior accelerates re-capture upon return, and a lightweight ReID-Gating mechanism maintains identity continuity using displacement cues in the anchor frame. The system predicts per-frame bounding boxes without assuming the target is visible in the first frame or explicitly modeling appearance variations. AR2-4FV achieves +10.3% Re-Capture Rate (RCR) improvement and -24.2% Re-Capture Latency (RCL) reduction over the best baseline, and ablation studies further confirm the benefits of the Anchor Map, re-entry prior, and ReID-Gating.