A Multi-view RGB-D Approach for Human Pose Estimation in Operating Rooms
This work addresses accurate human pose estimation for surgical teams in operating rooms, where existing RGB methods struggle, and improves on those methods incrementally by integrating depth data.
The paper tackles human pose estimation in challenging operating room environments by proposing a multi-view RGB-D approach that uses depth for pose refinement. The method jointly detects persons and estimates their poses without prior knowledge of the person count, and its benefits are demonstrated on a novel dataset acquired during live surgeries.
Many approaches have been proposed for human pose estimation in single and multi-view RGB images. However, some environments, such as the operating room, are still very challenging for state-of-the-art RGB methods. In this paper, we propose an approach for multi-view 3D human pose estimation from RGB-D images and demonstrate the benefits of using the additional depth channel for pose refinement beyond its use for the generation of improved features. The proposed method permits the joint detection and estimation of the poses without knowing a priori the number of persons present in the scene. We evaluate this approach on a novel multi-view RGB-D dataset acquired during live surgeries and annotated with ground truth 3D poses.
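To make the role of the depth channel and the multiple views concrete, the following is a minimal sketch of one generic way a 2D joint detection can be lifted to 3D with a depth reading and then combined across calibrated cameras. It is not the method proposed in the paper. The pinhole camera model, the function names (backproject, camera_to_world, fuse_views), the calibration values, and the median-based fusion are all assumptions made for illustration.

```python
from statistics import median

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame.

    Assumes an ideal pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy), and no lens distortion.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def camera_to_world(p_cam, R, t):
    """Map a camera-frame point to the world frame: X_world = R * X_cam + t.

    R is a 3x3 rotation given as nested lists, t is a 3-vector.
    """
    return tuple(
        sum(R[i][j] * p_cam[j] for j in range(3)) + t[i]
        for i in range(3)
    )

def fuse_views(points_world):
    """Combine per-view estimates of one joint with a coordinate-wise median.

    The median is an illustrative choice that tolerates a single outlying
    view, for example one where the joint is occluded.
    """
    return tuple(median(p[i] for p in points_world) for i in range(3))

# Hypothetical example: one joint seen by three cameras that share the
# same intrinsics and, for simplicity, the same orientation.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
FX = FY = 500.0
CX, CY = 320.0, 240.0

observations = [
    # (u, v, depth in metres, camera translation in the world frame)
    (320.0, 240.0, 2.0, (0.0, 0.0, 0.0)),
    (220.0, 240.0, 2.0, (0.4, 0.0, 0.0)),
    (320.0, 240.0, 5.0, (0.0, 0.0, 0.0)),  # outlier, e.g. depth read from the background
]

estimates = [
    camera_to_world(backproject(u, v, d, FX, FY, CX, CY), IDENTITY, t)
    for (u, v, d, t) in observations
]
joint_world = fuse_views(estimates)
```

In this toy setup the first two views agree on the joint position, and the coordinate-wise median discards the third view's inflated depth. A real multi-view system would also need to associate detections of the same person across views, which this sketch does not attempt.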