CV
Dec 28, 2025

PoseStreamer: A Multi-modal Framework for 3D Tracking of Unseen Moving Objects

arXiv:2512.22979v3
h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses a critical challenge in computer vision for robotics and AR/VR applications, offering a robust solution for high-speed object tracking with event cameras, though it appears incremental in its method integration.

The paper tackles 6DoF pose estimation for unseen moving objects in high-speed and low-light scenarios. It proposes PoseStreamer, a multi-modal framework that integrates adaptive memory, 2D tracking, and geometric refinement, and reports superior accuracy and strong generalizability in its experiments.

Six-degree-of-freedom (6DoF) pose estimation for novel objects is a critical task in computer vision, yet it faces significant challenges in high-speed and low-light scenarios where standard RGB cameras suffer from motion blur. While event cameras offer a promising solution due to their high temporal resolution, current 6DoF pose estimation methods typically yield suboptimal performance when objects move at high speed. To address this gap, we propose PoseStreamer, a robust multi-modal 6DoF pose estimation framework designed specifically for high-speed motion scenarios. Our approach integrates three core components: an Adaptive Pose Memory Queue that utilizes historical orientation cues for temporal consistency, an Object-centric 2D Tracker that provides strong 2D priors to boost 3D center recall, and a Ray Pose Filter for geometric refinement along camera rays. Furthermore, we introduce MoCapCube6D, a novel multi-modal dataset constructed to benchmark performance under rapid motion. Extensive experiments demonstrate that PoseStreamer not only achieves superior accuracy in high-speed motion scenarios, but also exhibits strong generalizability as a template-free framework for unseen moving objects.
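
The abstract does not spell out how a 2D prior constrains the 3D center or what refinement "along camera rays" involves. As a rough illustration of the underlying geometry only, and not the paper's Ray Pose Filter, the sketch below assumes a standard pinhole camera: a tracked 2D center fixes a ray through the camera origin, so every 3D center hypothesis with that image projection lies on the ray and differs only in depth. The function names, intrinsics, and depth values are hypothetical.

```python
def backproject(u, v, fx, fy, cx, cy):
    """Ray direction (x, y, 1) through pixel (u, v) for a pinhole camera."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def candidates_along_ray(u, v, intrinsics, depths):
    """3D center hypotheses that share one image projection, one per depth z."""
    dx, dy, dz = backproject(u, v, *intrinsics)
    return [(z * dx, z * dy, z * dz) for z in depths]


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D camera-frame point to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical intrinsics (fx, fy, cx, cy) and a hypothetical tracked 2D center.
intrinsics = (600.0, 600.0, 320.0, 240.0)
centers = candidates_along_ray(400.0, 200.0, intrinsics, [0.5, 1.0, 2.0])
```

Every entry in `centers` reprojects to the same pixel (400, 200), so under this assumption a reliable 2D center reduces the search for the 3D center to a one-dimensional choice of depth along the ray.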

