Segmentation of Surgical Instruments for Minimally-Invasive Robot-Assisted Procedures Using Generative Deep Neural Networks
This work aims to reduce the manual labeling effort for surgical instrument segmentation in medical robotics, but it is incremental, as it builds on existing domain adaptation and segmentation methods.
The paper tackles semantic segmentation of surgical instruments in minimally invasive robot-assisted procedures. It uses CycleGAN for domain adaptation to augment the training data, which reduces manual labeling and improves segmentation performance, though the authors note limited generalization to instruments with different shapes.
This work shows that semantic segmentation of minimally invasive surgical instruments can be improved by using training data that has been augmented through domain adaptation. The benefit of this method is twofold. First, it removes the need to manually label thousands of images by transforming synthetic data into realistic-looking data. To achieve this, a CycleGAN model is used, which transforms a source dataset to approximate the domain distribution of a target dataset. Second, this newly generated data, which comes with perfect labels, is used to train a semantic segmentation neural network, U-Net. The method generalizes to data that varies in rotation, position, and lighting conditions. Nevertheless, one caveat of this approach is that the model does not generalize well to surgical instruments whose shape differs from the one used for training. This is caused by the low variance in the geometric distribution of the training data. Future work will focus on making the model more scale-invariant and able to adapt to types of surgical instruments not seen during training.
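The domain-adaptation step described above relies on CycleGAN, whose defining ingredient is a cycle-consistency term: an image translated from the synthetic domain to the real domain and back should return close to where it started. The following is a minimal sketch of that term only, in plain Python on toy nested-list "images". The generators `brighten`, `darken`, and `darken_weak` are hypothetical stand-ins for the two learned networks, not the models used in this work, and the adversarial losses that a full CycleGAN objective also includes are omitted.

```python
from typing import Callable, List

# Toy grayscale image: rows of pixel intensities (assumption for illustration).
Image = List[List[float]]
Generator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute pixel difference between two equally sized images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def cycle_consistency_loss(x: Image, y: Image, g: Generator, f: Generator) -> float:
    """Cycle term: ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1.

    g maps the source (synthetic) domain to the target (real) domain,
    f maps the target domain back to the source domain.
    """
    forward_cycle = l1_distance(f(g(x)), x)   # synthetic -> real -> synthetic
    backward_cycle = l1_distance(g(f(y)), y)  # real -> synthetic -> real
    return forward_cycle + backward_cycle


def shift(amount: float) -> Generator:
    """Hypothetical 'generator' that adds a constant to every pixel."""
    return lambda image: [[pixel + amount for pixel in row] for row in image]


brighten = shift(0.25)       # stand-in for G: synthetic -> real
darken = shift(-0.25)        # stand-in for F, an exact inverse of G
darken_weak = shift(-0.125)  # stand-in for an F that undoes G only partially

synthetic = [[0.0, 0.5], [0.25, 1.0]]
real = [[0.5, 0.75], [0.25, 0.0]]

perfect_loss = cycle_consistency_loss(synthetic, real, brighten, darken)
imperfect_loss = cycle_consistency_loss(synthetic, real, brighten, darken_weak)
print(perfect_loss, imperfect_loss)
```

When the two mappings invert each other exactly, the term is zero; any mismatch between them raises it. In an actual training setup the shifts would be replaced by convolutional generators and the loss computed on tensors, but the structure of the term is the same.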