Geometry OR Tracker: Universal Geometric Operating Room Tracking
This work addresses the challenge of stable multi-view 3D tracking in clinical settings for applications such as surgeon behavior recognition, although the contribution is incremental: it builds on existing tracking methods and adds a novel rectification module.
The paper tackles the problem of unreliable camera calibration and RGB-D registration in operating rooms, which causes geometric inconsistency and degrades 3D tracking; it introduces Geometry OR Tracker, a two-stage pipeline that reduces cross-view depth disagreement by more than 30× and improves world-frame tracking accuracy.
In operating rooms (ORs), world-scale multi-view 3D tracking supports downstream applications such as surgeon behavior recognition, where physically meaningful quantities such as distances and motion statistics must be measured in meters. However, real clinical deployments rarely satisfy the geometric prerequisites for stable multi-view fusion and tracking: camera calibration and RGB-D registration are often unreliable, leading to cross-view geometric inconsistency that produces "ghosting" during fusion and degrades 3D trajectories in a shared OR coordinate frame. To address this, we introduce Geometry OR Tracker, a two-stage pipeline that first rectifies imprecise calibration into a scale-consistent and geometrically consistent camera setup with a single global scale via a Multi-view Metric Geometry Rectification module, and then performs Occlusion-Robust 3D Point Tracking directly in the unified OR world frame. On the MM-OR benchmark, improved geometric consistency translates into tracking gains: our rectification front-end reduces cross-view depth disagreement by more than 30$\times$ compared to raw calibration. Ablation studies further demonstrate the relationship between calibration quality and tracking accuracy, showing that improved geometric consistency yields stronger world-frame tracking.
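The abstract does not state how cross-view depth disagreement is computed. As a rough illustration only, one plausible definition reprojects the depth map of one camera into a second camera and averages the absolute depth difference at the pixels where both views have valid depth. The sketch below assumes a pinhole model, 4x4 camera-to-world poses, depth in meters with 0 marking missing values, and nearest-pixel lookup without occlusion handling. The function name `cross_view_depth_disagreement` and all of its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def cross_view_depth_disagreement(depth_a, K_a, T_a, depth_b, K_b, T_b):
    """Mean absolute depth difference (meters) after reprojecting view A into view B.

    Assumptions (not from the paper): K_* are 3x3 pinhole intrinsics, T_* are
    4x4 camera-to-world transforms, and depth value 0 means "no measurement".
    """
    h, w = depth_a.shape
    v, u = np.mgrid[0:h, 0:w]
    u, v, z = u.ravel(), v.ravel(), depth_a.ravel().astype(np.float64)

    # Keep only pixels of A that carry a depth measurement.
    keep = z > 0
    u, v, z = u[keep], v[keep], z[keep]

    # Back-project A's pixels to 3D points in camera A's frame.
    pix = np.stack([u, v, np.ones_like(z)])          # 3xN homogeneous pixels
    pts_a = (np.linalg.inv(K_a) @ pix) * z           # 3xN points in camera A

    # Move the points A -> world -> camera B.
    pts_a_h = np.vstack([pts_a, np.ones((1, z.size))])
    pts_b = (np.linalg.inv(T_b) @ T_a @ pts_a_h)[:3]

    # Discard points behind camera B, then project into B's image.
    pts_b = pts_b[:, pts_b[2] > 1e-6]
    proj = K_b @ pts_b
    u_b = np.rint(proj[0] / proj[2]).astype(int)
    v_b = np.rint(proj[1] / proj[2]).astype(int)

    hb, wb = depth_b.shape
    inside = (u_b >= 0) & (u_b < wb) & (v_b >= 0) & (v_b < hb)
    z_pred = pts_b[2, inside]                        # depth B should see, per A
    z_obs = depth_b[v_b[inside], u_b[inside]]        # depth B actually measured

    ok = z_obs > 0
    if not ok.any():
        return float("nan")                          # no overlapping valid pixels
    return float(np.mean(np.abs(z_pred[ok] - z_obs[ok])))
```

Under this definition, two perfectly consistent views give a value near zero, while a scale error in one camera's depth shows up directly in meters: a 10% scale error on a surface 2 m away yields a disagreement of about 0.2 m. A real evaluation would also need occlusion handling, which this sketch omits.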