CV · Mar 29
MoViD: View-Invariant 3D Human Pose Estimation via Motion-View Disentanglement
Yejia Liu, Hengle Jiang, Haoxian Liu et al.
3D human pose estimation is a key enabling technology for applications such as healthcare monitoring, human-robot collaboration, and immersive gaming, but real-world deployment remains challenged by viewpoint variations. Existing methods struggle to generalize to unseen camera viewpoints, require large amounts of training data, and suffer from high inference latency. We propose MoViD, a viewpoint-invariant 3D human pose estimation framework that disentangles viewpoint information from motion features. The key idea is to extract viewpoint information from intermediate pose features and leverage it to enhance both the robustness and efficiency of pose estimation. MoViD introduces a view estimator that models key joint relationships to predict viewpoint information, and an orthogonal projection module to disentangle motion and view features, further enhanced through physics-grounded contrastive alignment across views. For real-time edge deployment, MoViD employs a frame-by-frame inference pipeline with a view-aware strategy that adaptively activates flip refinement based on the estimated viewpoint. Evaluations on nine public datasets and newly collected multiview UAV and gait analysis datasets show that MoViD reduces pose estimation error by over 24.2% compared to state-of-the-art methods, maintains robust performance under severe occlusions with 60% less training data, and achieves real-time inference at 15 FPS on NVIDIA edge devices.
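The abstract does not specify how the orthogonal projection module is built. As a rough illustration of the general idea only, the sketch below splits a feature vector into a part lying in a "view" subspace and a remainder orthogonal to it. The function name `split_motion_and_view`, the array shapes, and the assumption that view directions are given as linearly independent row vectors are all hypothetical and not taken from the paper.

```python
import numpy as np


def split_motion_and_view(features, view_dirs):
    """Split features into a view part and an orthogonal remainder.

    features:  (n, d) array, one feature vector per row.
    view_dirs: (k, d) array of linearly independent directions (k <= d)
               that are ASSUMED to span the view subspace.
    Returns (motion_part, view_part), both of shape (n, d).
    """
    # Orthonormal basis of the view subspace via reduced QR: q has shape (d, k).
    q, _ = np.linalg.qr(view_dirs.T)
    # Orthogonal projection of each feature onto the view subspace.
    view_part = features @ q @ q.T
    # What remains is orthogonal to every view direction.
    motion_part = features - view_part
    return motion_part, view_part


rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))
dirs = rng.normal(size=(2, 8))
motion, view = split_motion_and_view(feats, dirs)
# Inner products between the remainder and the view directions vanish.
print(np.abs(motion @ dirs.T).max() < 1e-10)
```

In a learned model the view directions would presumably come from the network rather than being fixed in advance; the sketch only shows the linear-algebra step.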
HC · Mar 29
WearBCI Dataset: Understanding and Benchmarking Real-World Wearable Brain-Computer Interfaces Signals
Haoxian Liu, Hengle Jiang, Lanxuan Hong et al.
Brain-computer interfaces (BCIs) have opened new platforms for human-computer interaction, medical diagnostics, and neurorehabilitation. Wearable BCI systems, which typically employ non-invasive electrodes for portable monitoring, hold great promise for real-world applications, but also face significant signal quality degradation caused by motion artifacts and environmental interference. Most existing wearable BCI datasets are collected under stationary or controlled lab settings, limiting their utility for evaluating performance under body movement. To bridge this gap, we introduce WearBCI, the first dataset that comprehensively evaluates wearable BCI signals under different motion dynamics, with synchronized recordings of electroencephalography (EEG), inertial measurement unit (IMU) data, and egocentric video, along with systematic benchmark evaluations for studying the impact of motion artifacts. Specifically, we collect data from 36 participants across different motion dynamics, including body movements, walking, and navigation. We analyze the collected wearable EEG signals to understand the impact of motion artifacts across different conditions, and benchmark representative EEG signal enhancement techniques on our dataset. Furthermore, we explore two new case studies: cross-modal EEG signal enhancement and multi-dimensional human behavior understanding. These findings offer valuable insights into real-world wearable BCI deployment and new applications.
CV · Nov 13, 2025
Multivariate Gaussian Representation Learning for Medical Action Evaluation
Luming Yang, Haoxian Liu, Siqing Li et al.
Fine-grained action evaluation in medical vision faces unique challenges due to the unavailability of comprehensive datasets, stringent precision requirements, and insufficient spatiotemporal dynamic modeling of very rapid actions. To support development and evaluation, we introduce CPREval-6k, a multi-view, multi-label medical action benchmark containing 6,372 expert-annotated videos with 22 clinical labels. Using this dataset, we present GaussMedAct, a multivariate Gaussian encoding framework, to advance medical motion analysis through adaptive spatiotemporal representation learning. Multivariate Gaussian Representation projects the joint motions to a temporally scaled multi-dimensional space, and decomposes actions into adaptive 3D Gaussians that serve as tokens. These tokens preserve motion semantics through anisotropic covariance modeling while maintaining robustness to spatiotemporal noise. Hybrid Spatial Encoding, employing a Cartesian and Vector dual-stream strategy, effectively utilizes skeletal information in the form of joint and bone features. The proposed method achieves 92.1% Top-1 accuracy with real-time inference on the benchmark, outperforming the ST-GCN baseline by +5.9% accuracy with only 10% of the FLOPs. Cross-dataset experiments confirm that our method is also more robust.
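The abstract gives no implementation details for the Gaussian tokens. Purely as an illustration of what an anisotropic 3D Gaussian over a motion segment can look like, the sketch below stacks 2D joint positions with a scaled time axis and fits a mean and full covariance to each segment. The function name `gaussian_tokens`, the `time_scale` parameter, the (x, y, scaled time) layout, and the use of fixed equal-length windows in place of the adaptive decomposition described above are all assumptions.

```python
import numpy as np


def gaussian_tokens(traj, n_tokens, time_scale=1.0):
    """Summarize a 2D joint trajectory as a list of 3D Gaussians.

    traj:       (T, 2) array of joint positions over T frames.
    n_tokens:   number of segments; fixed windows are an ASSUMED stand-in
                for an adaptive decomposition.
    time_scale: ASSUMED scalar weighting of the time axis.
    Returns a list of (mean, cov) with shapes (3,) and (3, 3).
    """
    num_frames = traj.shape[0]
    if num_frames < 2 * n_tokens:
        raise ValueError("need at least two frames per token")
    # Append a scaled time coordinate so each frame becomes a 3D point.
    t = time_scale * np.arange(num_frames, dtype=float)[:, None]
    points = np.hstack([traj, t])
    tokens = []
    for chunk in np.array_split(points, n_tokens):
        mean = chunk.mean(axis=0)
        # Full covariance: off-diagonal terms capture the motion direction.
        cov = np.cov(chunk, rowvar=False)
        tokens.append((mean, cov))
    return tokens


# A joint moving only along x: the y-variance of every token is zero.
line = np.stack([np.linspace(0.0, 1.0, 20), np.zeros(20)], axis=1)
for mean, cov in gaussian_tokens(line, n_tokens=4):
    print(mean.round(3), cov[1, 1])
```

A full covariance matrix, rather than a single scalar variance, is what allows a token to be elongated along the direction of motion.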
SD · Aug 9, 2024
AcousAF: Acoustic Sensing-Based Atrial Fibrillation Detection System for Mobile Phones
Xuanyu Liu, Haoxian Liu, Jiao Li et al.
Atrial fibrillation (AF) is characterized by irregular electrical impulses originating in the atria, which can lead to severe complications and even death. Because AF is intermittent, early and timely monitoring is critical for preventing the condition from worsening. Although ambulatory ECG Holter monitors provide accurate monitoring, the high cost of these devices hinders their wider adoption. Current mobile-based AF detection systems offer a portable solution. However, these systems have various applicability issues, such as being easily affected by environmental factors and requiring significant user effort. To overcome these limitations, we present AcousAF, a novel AF detection system based on the acoustic sensors of smartphones. In particular, we explore the potential of pulse wave acquisition from the wrist using smartphone speakers and microphones. In addition, we propose a framework comprising pulse wave probing, pulse wave extraction, and AF detection to ensure accurate and reliable AF detection. We collect data from 20 participants using our custom data collection application on the smartphone. Extensive experimental results demonstrate the high performance of our system, with 92.8% accuracy, 86.9% precision, 87.4% recall, and 87.1% F1 score.
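As a quick arithmetic check, the reported F1 score can be recomputed from the reported precision and recall using the standard definition of F1 as their harmonic mean. The helper name `f1_from_precision_recall` is chosen here for illustration; whether the paper averages the metric in exactly this way (for example, across folds or participants) is not stated in the abstract.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_from_precision_recall(0.869, 0.874)
print(round(100 * f1, 1))  # prints 87.1
```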