CV · Mar 18

OnlineHMR: Video-based Online World-Grounded Human Mesh Recovery

arXiv:2603.17355 · 59.12 citations · h-index: 2
AI Analysis

This work targets real-time, interactive applications such as AR/VR and telepresence by enabling online processing. Its contribution is incremental, since it builds on existing HMR methods.

The paper addresses the offline-only limitation of human mesh recovery from monocular videos by proposing OnlineHMR, a fully online framework whose performance is comparable to existing chunk-based methods on the EMDB benchmark and on highly dynamic custom videos.

Human mesh recovery (HMR) models the 3D human body from monocular videos, with recent works extending it to world-coordinate human trajectory and motion reconstruction. However, most existing methods remain offline, relying on future frames or global optimization, which limits their applicability in interactive feedback and perception-action loop scenarios such as AR/VR and telepresence. To address this, we propose OnlineHMR, a fully online framework that jointly satisfies four essential criteria of online processing: system-level causality, faithfulness, temporal consistency, and efficiency. Built upon a two-branch architecture, OnlineHMR enables streaming inference via a causal key-value cache design and a curated sliding-window learning strategy. Meanwhile, a human-centric incremental SLAM provides online world-grounded alignment under physically plausible trajectory correction. Experimental results show that our method achieves performance comparable to existing chunk-based approaches on the standard EMDB benchmark and on highly dynamic custom videos, while uniquely supporting online processing. The project page and code are available at https://tsukasane.github.io/Video-OnlineHMR/.
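
To make the idea of streaming inference with a causal key-value cache more concrete, the following is a minimal sketch and not the authors' implementation. It assumes single-head attention with no learned projections, where each per-frame feature vector acts as query, key, and value. The names `SlidingWindowKVCache`, `step`, `stream`, and the `window` parameter are hypothetical and chosen only for illustration.

```python
import math
from collections import deque


class SlidingWindowKVCache:
    """Hypothetical causal key-value cache that keeps only the last `window` frames."""

    def __init__(self, window):
        # deque(maxlen=...) evicts the oldest entry automatically when full.
        self.keys = deque(maxlen=window)
        self.values = deque(maxlen=window)

    def step(self, query, key, value):
        """Cache the current frame's key/value, then attend over the cache."""
        self.keys.append(key)
        self.values.append(value)
        scale = 1.0 / math.sqrt(len(query))
        scores = [
            scale * sum(q * k for q, k in zip(query, cached))
            for cached in self.keys
        ]
        # Numerically stable softmax over current and past frames only.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        return [
            sum(w * v[i] for w, v in zip(weights, self.values)) / total
            for i in range(len(value))
        ]


def stream(frames, window):
    """Process per-frame feature vectors one at a time, never looking ahead."""
    cache = SlidingWindowKVCache(window)
    # Assumption: each frame feature serves as query, key, and value.
    return [cache.step(f, f, f) for f in frames]


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
    for t, out in enumerate(stream(frames, window=2)):
        print(t, out)
```

Because each output depends only on the current frame and the cached past frames, appending future frames to the input does not change earlier outputs, which is the causality property the abstract lists among its criteria. The bounded window keeps memory and per-frame cost constant as the video grows.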
