CV Aug 7, 2024
PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model
Yunlong Huang, Junshuo Liu, Ke Xian et al.
Transformers have significantly advanced the field of 3D human pose estimation (HPE). However, existing transformer-based methods primarily use self-attention mechanisms for spatio-temporal modeling, leading to quadratic complexity, unidirectional modeling of spatio-temporal relationships, and insufficient learning of spatio-temporal correlations. Recently, the Mamba architecture, utilizing the state space model (SSM), has exhibited superior long-range modeling capabilities in a variety of vision tasks with linear complexity. In this paper, we propose PoseMamba, a novel purely SSM-based approach with linear complexity for 3D human pose estimation in monocular video. Specifically, we propose a bidirectional global-local spatio-temporal SSM block that comprehensively models human joint relations within individual frames as well as temporal correlations across frames. Within this bidirectional global-local spatio-temporal SSM block, we introduce a reordering strategy to enhance the local modeling capability of the SSM. This strategy provides a more logical geometric scanning order and integrates it with the global SSM, resulting in a combined global-local spatial scan. We have quantitatively and qualitatively evaluated our approach using two benchmark datasets: Human3.6M and MPI-INF-3DHP. Extensive experiments demonstrate that PoseMamba achieves state-of-the-art performance on both datasets while maintaining a smaller model size and reducing computational costs. The code and models will be released.
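To make the idea of a linear-complexity bidirectional scan concrete, here is a minimal sketch in Python. It is not the PoseMamba block: it assumes a scalar, fixed-parameter state space recurrence (`h = a*h + b*x`, `y = c*h`) rather than Mamba's input-dependent parameters, and the function names and default values are hypothetical. It only shows that each direction visits every element once, and that summing a forward and a backward pass lets every position see context from both sides.

```python
def linear_scan(xs, a=0.9, b=1.0, c=1.0):
    """One pass of a scalar linear state space recurrence (assumed toy form)."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # state update: one step per element, so O(n) overall
        ys.append(c * h)    # readout
    return ys


def bidirectional_scan(xs, a=0.9, b=1.0, c=1.0):
    """Sum of a forward scan and a backward scan over the same sequence."""
    forward = linear_scan(xs, a, b, c)
    # Scan the reversed sequence, then flip the result back into original order.
    backward = linear_scan(xs[::-1], a, b, c)[::-1]
    return [f + g for f, g in zip(forward, backward)]


if __name__ == "__main__":
    # An impulse in the middle spreads to both neighbours only in the
    # bidirectional case; a forward-only scan leaves the first position at 0.
    print(linear_scan([0.0, 1.0, 0.0], a=0.5))
    print(bidirectional_scan([0.0, 1.0, 0.0], a=0.5))
```

In a pose setting the sequence could be joints within a frame or frames over time; how the elements are ordered before scanning is exactly what the abstract's reordering strategy addresses, and that ordering is not reproduced here.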
CL Jan 27, 2023
Semantic Network Model for Sign Language Comprehension
Xinchen Kang, Dengfeng Yao, Minghu Jiang et al.
In this study, the authors propose a computational cognitive model for sign language (SL) perception and comprehension with detailed algorithmic descriptions based on cognitive functionalities in human language processing. The semantic network model (SNM), which represents semantic relations between concepts, is used as a form of knowledge representation. The proposed model is applied to the comprehension of classifier predicates in sign language. The spreading activation search method is initiated by labeling a set of source nodes (e.g., concepts in the semantic network) with weights or "activation" and then iteratively propagating or "spreading" that activation out to other nodes linked to the source nodes. The results demonstrate that the proposed search method improves the performance of sign language comprehension in the SNM.
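The spreading activation procedure described in this abstract can be sketched as follows. This is a generic textbook-style version, not the authors' implementation: the `decay` factor, the firing `threshold`, the cap at 1.0, and the example graph are all assumptions introduced for illustration.

```python
def spread_activation(graph, sources, decay=0.5, threshold=0.1, max_iters=10):
    """Propagate activation from source nodes through a weighted graph.

    graph:   dict mapping node -> {neighbor: link weight}
    sources: dict mapping source node -> initial activation
    decay, threshold, max_iters: assumed illustrative parameters.
    """
    activation = dict(sources)
    frontier = set(sources)
    for _ in range(max_iters):
        # Collect all pulses first so the result does not depend on visit order.
        incoming = {}
        for node in frontier:
            for neighbor, weight in graph.get(node, {}).items():
                pulse = activation[node] * weight * decay
                if pulse >= threshold:  # weak pulses are dropped
                    incoming[neighbor] = incoming.get(neighbor, 0.0) + pulse
        # Apply the pulses; only nodes whose activation rose fire next round.
        frontier = set()
        for node, pulse in incoming.items():
            old = activation.get(node, 0.0)
            new = min(1.0, old + pulse)  # cap activation at 1.0
            if new > old:
                activation[node] = new
                frontier.add(node)
        if not frontier:
            break
    return activation


if __name__ == "__main__":
    # Hypothetical three-concept network: vehicle -> car -> wheel.
    network = {"vehicle": {"car": 1.0}, "car": {"wheel": 0.8}}
    print(spread_activation(network, {"vehicle": 1.0}))
```

Concepts that end up with high activation are the candidates most strongly related to the source concepts; the decay and threshold keep the search from flooding the whole network.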
AI Jul 31, 2024
TRGR: Transmissive RIS-aided Gait Recognition Through Walls
Yunlong Huang, Junshuo Liu, Jianan Zhang et al.
Gait recognition with radio frequency (RF) signals enables many potential applications requiring accurate identification. However, current systems require individuals to be within a line-of-sight (LOS) environment and struggle with low signal-to-noise ratio (SNR) when signals traverse concrete and thick walls. To address these challenges, we present TRGR, a novel transmissive reconfigurable intelligent surface (RIS)-aided gait recognition system. TRGR can recognize human identities through walls using only the magnitude measurements of channel state information (CSI) from a pair of transceivers. Specifically, by leveraging transmissive RIS alongside a configuration alternating optimization algorithm, TRGR enhances wall penetration and signal quality, enabling accurate gait recognition. Furthermore, a residual convolution network (RCNN) is proposed as the backbone network to learn robust human information. Experimental results confirm the efficacy of transmissive RIS, highlighting its significant potential for enhancing RF-based gait recognition systems. Extensive experimental results show that TRGR achieves an average accuracy of 97.88% in identifying persons when signals traverse concrete walls, demonstrating the effectiveness and robustness of TRGR.
RO Mar 6, 2019
Design of A Two-point Steering Path Planner Using Geometric Control
Yunlong Huang
For lateral vehicle dynamics, planning trajectories for lane-keeping and lane-change can be generalized as a path planning task to stabilize a vehicle onto a target lane, which is a fundamental element of today's autonomous driving systems. On the other hand, two-point steering for lane-change and lane-keeping has been investigated by researchers from psychology as a sensorimotor mechanism of human drivers. In the first part of this paper, using knowledge of geometric control, we design a path planner that satisfies five design objectives: generalization to different vehicle models, convergence to the target lane, optimality, safety in lane-change maneuvers, and low computational complexity. Later, based on this path planner, a two-point steering path planner is proposed, and it is proved rigorously that this two-point steering path planner possesses the following advantage: the steering radius of the planned trajectory is smaller than the intrinsic radius of the reference line of the target lane. This advantage is also described as "corner-cutting" in driving. The smaller driving radius of the trajectory will result in higher vehicle speed along winding roads and greater comfort for the passengers.