Qingkai Yang

CV
h-index3
3papers
64citations
Novelty62%
AI Score49

3 Papers

ROApr 20
StableIDM: Stabilizing Inverse Dynamics Model against Manipulator Truncation via Spatio-Temporal Refinement

Kerui Li, Zhe Jing, Xiaofeng Wang et al.

Inverse Dynamics Models (IDMs) map visual observations to low-level action commands, serving as central components for data labeling and policy execution in embodied AI. However, their performance degrades severely under manipulator truncation, a common failure mode that makes state recovery ill-posed and leads to unstable control. We present StableIDM, a spatio-temporal framework that refines features from visual inputs to stabilize action predictions under such partial observability. StableIDM integrates three complementary components: (1) auxiliary robot-centric masking to suppress background clutter, (2) Directional Feature Aggregation (DFA) for geometry-aware spatial reasoning, which extracts anisotropic features along directions inferred from the visible arm and (3) Temporal Dynamics Refinement (TDR) to smooth and correct predictions via motion continuity. Extensive evaluations validate our approach: StableIDM improves strict action accuracy by 12.1% under severe truncation on the AgiBot benchmark, and increases average task success by 9.7% in real-robot replay. Moreover, it boosts end-to-end grasp success by 11.5% when decoding video-generated plans, and improves downstream VLA real-robot success by 17.6% when functioning as an automatic annotator. These results demonstrate that StableIDM provides a robust and scalable backbone for both policy execution and data generation in embodied artificial intelligence.

CVSep 12, 2025Code
ISTASTrack: Bridging ANN and SNN via ISTA Adapter for RGB-Event Tracking

Siying Liu, Zikai Wang, Hanle Zheng et al.

RGB-Event tracking has become a promising trend in visual object tracking to leverage the complementary strengths of both RGB images and dynamic spike events for improved performance. However, existing artificial neural networks (ANNs) struggle to fully exploit the sparse and asynchronous nature of event streams. Recent efforts toward hybrid architectures combining ANNs and spiking neural networks (SNNs) have emerged as a promising solution in RGB-Event perception, yet effectively fusing features across heterogeneous paradigms remains a challenge. In this work, we propose ISTASTrack, the first transformer-based \textbf{A}NN-\textbf{S}NN hybrid \textbf{Track}er equipped with \textbf{ISTA} adapters for RGB-Event tracking. The two-branch model employs a vision transformer to extract spatial context from RGB inputs and a spiking transformer to capture spatio-temporal dynamics from event streams. To bridge the modality and paradigm gap between ANN and SNN features, we systematically design a model-based ISTA adapter for bidirectional feature interaction between the two branches, derived from sparse representation theory by unfolding the iterative shrinkage thresholding algorithm. Additionally, we incorporate a temporal downsampling attention module within the adapter to align multi-step SNN features with single-step ANN features in the latent space, improving temporal fusion. Experimental results on RGB-Event tracking benchmarks, such as FE240hz, VisEvent, COESOT, and FELT, have demonstrated that ISTASTrack achieves state-of-the-art performance while maintaining high energy efficiency, highlighting the effectiveness and practicality of hybrid ANN-SNN designs for robust visual tracking. The code is publicly available at https://github.com/lsying009/ISTASTrack.git.

SYJun 29, 2019
Distributed Global Output-Feedback Control for a Class of Euler-Lagrange Systems

Qingkai Yang, Hao Fang, Jie Chen et al.

This published paper investigates the distributed tracking control problem for a class of Euler-Lagrange multi-agent systems when the agents can only measure the positions. In this case, the lack of the separation principle and the strong nonlinearity in unmeasurable states pose severe technical challenges to global output-feedback control design. To overcome these difficulties, a global nonsingular coordinate transformation matrix in the upper triangular form is firstly proposed such that the nonlinear dynamic model can be partially linearized with respect to the unmeasurable states. And, a new type of velocity observers is designed to estimate the unmeasurable velocities for each system. Then, based on the outputs of the velocity observers, we propose distributed control laws that enable the coordinated tracking control system to achieve uniform global exponential stability (UGES). Both theoretical analysis and numerical simulations are presented to validate the effectiveness of the proposed control scheme. Followed by the original paper, a typo and a mistake is corrected.