CV · Dec 22, 2022

Confidence-Aware Paced-Curriculum Learning by Label Smoothing for Surgical Scene Understanding

arXiv:2212.11511v1 · 3 citations · h-index: 61
Originality: Incremental advance
AI Analysis

This work addresses surgical scene understanding for robotic vision, but it is incremental as it builds on existing curriculum learning and label smoothing techniques.

The authors aim to improve surgical scene understanding with a paced curriculum learning method based on label smoothing: the smoothing factor is gradually reduced during training to control the model's learning utility. The results show improved prediction accuracy and robustness across four robotic surgery datasets covering classification, captioning, and segmentation tasks.

Curriculum learning and self-paced learning are training strategies that gradually feed samples to the model from easy to more complex. They have attracted increasing attention due to their excellent performance in robotic vision. Most recent works focus on designing curricula based on the difficulty levels of input samples or on smoothing the feature maps. However, smoothing labels to control the learning utility in a curriculum manner is still unexplored. In this work, we design a paced curriculum by label smoothing (P-CBLS) that uses paced learning with uniform label smoothing (ULS) for classification tasks and fuses uniform and spatially varying label smoothing (SVLS) for semantic segmentation tasks in a curriculum manner. In ULS and SVLS, a larger smoothing factor imposes a heavier smoothing penalty on the true label and limits the model to learning less information. We therefore design a curriculum by label smoothing (CBLS): we set a large smoothing value at the beginning of training and gradually decrease it to zero, raising the model's learning utility from lower to higher. We also design a confidence-aware pacing function and combine it with our CBLS to investigate the benefits of various curricula. The proposed techniques are validated on four robotic surgery datasets covering multi-class classification, multi-label classification, captioning, and segmentation tasks. We also investigate the robustness of our method by corrupting validation data at different severity levels. Our extensive analysis shows that the proposed method improves prediction accuracy and robustness.
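
The core idea in the abstract, a smoothing factor that starts large and shrinks to zero over training, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smooth_labels` and `linear_decay`, the linear shape of the schedule, and the starting value 0.4 are all assumptions chosen for illustration, and the confidence-aware pacing function is not modelled here. Only uniform label smoothing (ULS) for a single classification target is shown.

```python
from typing import List


def smooth_labels(true_class: int, num_classes: int, alpha: float) -> List[float]:
    """Uniform label smoothing: blend a one-hot target with a uniform distribution.

    alpha = 0 returns the plain one-hot target; larger alpha moves more
    probability mass away from the true class.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    uniform_mass = alpha / num_classes
    target = [uniform_mass] * num_classes
    target[true_class] += 1.0 - alpha
    return target


def linear_decay(epoch: int, total_epochs: int, alpha_start: float) -> float:
    """Hypothetical schedule (an assumption, not taken from the paper):
    decay the smoothing factor linearly from alpha_start to 0 at the last epoch.
    """
    if total_epochs <= 1:
        return 0.0
    return alpha_start * (1.0 - epoch / (total_epochs - 1))


if __name__ == "__main__":
    # Print the smoothing factor and the smoothed target for class 1 of 4
    # at each epoch of a 5-epoch run.
    for epoch in range(5):
        alpha = linear_decay(epoch, total_epochs=5, alpha_start=0.4)
        target = smooth_labels(true_class=1, num_classes=4, alpha=alpha)
        print(epoch, round(alpha, 3), [round(p, 3) for p in target])
```

Early epochs train against heavily smoothed targets, and by the final epoch the target is the ordinary one-hot label, which matches the "lower to higher learning utility" progression described above. Any monotonically decreasing schedule could be substituted for the linear one.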

Code Implementations: 1 repo