CVNov 13, 2022

Using Hand Pose Estimation To Automate Open Surgery Training Feedback

arXiv:2211.07021v211 citationsh-index: 17
Originality Synthesis-oriented
AI Analysis

This work addresses the need for efficient, remote, and markerless training feedback for surgeons, though it is incremental as it builds on existing computer vision methods applied to a new domain-specific dataset.

This research tackled the problem of automating surgical training feedback by using 2D hand pose estimation from video to model hand movements and interactions with instruments, achieving 88.35% gesture segmentation accuracy and identifying significant differences between novices and experts through novel surgical dexterity proxies.

Purpose: This research aims to facilitate the use of state-of-the-art computer vision algorithms for the automated training of surgeons and the analysis of surgical footage. By estimating 2D hand poses, we model the movement of the practitioner's hands, and their interaction with surgical instruments, to study their potential benefit for surgical training. Methods: We leverage pre-trained models on a publicly-available hands dataset to create our own in-house dataset of 100 open surgery simulation videos with 2D hand poses. We also assess the ability of pose estimations to segment surgical videos into gestures and tool-usage segments and compare them to kinematic sensors and I3D features. Furthermore, we introduce 6 novel surgical dexterity proxies stemming from domain experts' training advice, all of which our framework can automatically detect given raw video footage. Results: State-of-the-art gesture segmentation accuracy of 88.35\% on the Open Surgery Simulation dataset is achieved with the fusion of 2D poses and I3D features from multiple angles. The introduced surgical skill proxies presented significant differences for novices compared to experts and produced actionable feedback for improvement. Conclusion: This research demonstrates the benefit of pose estimations for open surgery by analyzing their effectiveness in gesture segmentation and skill assessment. Gesture segmentation using pose estimations achieved comparable results to physical sensors while being remote and markerless. Surgical dexterity proxies that rely on pose estimation proved they can be used to work towards automated training feedback. We hope our findings encourage additional collaboration on novel skill proxies to make surgical training more efficient.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes