Better (pseudo-)labels for semi-supervised instance segmentation
This work addresses the challenge of limited labeled data and skewed class distributions in instance segmentation, offering improvements for applications requiring fine-grained object recognition, though it is incremental in nature.
The paper tackles miscalibration and inefficient learning from few examples in semi-supervised teacher-student methods for instance segmentation, particularly for rare classes. On the LVIS dataset it reports a 2.8% increase in average precision (AP) overall and a 10.3% AP gain for rare classes over a supervised baseline.
Despite the availability of large datasets for tasks like image classification and image-text alignment, labeled data for more complex recognition tasks, such as detection and segmentation, is less abundant. For instance segmentation in particular, annotations are time-consuming to produce, and the distribution of instances is often highly skewed across classes. While semi-supervised teacher-student distillation methods show promise in leveraging vast amounts of unlabeled data, they suffer from miscalibration, resulting in overconfidence in frequently represented classes and underconfidence in rarer ones. Additionally, these methods have difficulty learning efficiently from a limited set of examples. First, we introduce a dual strategy to enhance the teacher model's training process, substantially improving few-shot learning performance. Second, we propose a calibration correction mechanism that enables the student model to correct the teacher's calibration errors. Using our approach, we observe marked improvements over a state-of-the-art supervised baseline on the LVIS dataset, with an increase of 2.8% in average precision (AP) and a 10.3% gain in AP for rare classes.
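The miscalibration described above can be made concrete with a standard diagnostic. The following is a minimal sketch, not the paper's calibration correction mechanism: it computes the signed confidence gap (mean confidence minus accuracy) and the expected calibration error (ECE) for one class at a time. The function names and the toy confidence values are assumptions made up for illustration; they are not results from the paper.

```python
def signed_confidence_gap(confidences, correct):
    """Mean confidence minus accuracy.

    Positive values indicate overconfidence, negative values underconfidence.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(1 for ok in correct if ok) / len(correct)
    return mean_conf - accuracy


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: per-bin |mean confidence - accuracy|, weighted by bin size."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Toy (made-up) teacher predictions for two hypothetical classes.
# Frequent class: high confidence, but only half the predictions are correct.
frequent_conf = [0.9, 0.9, 0.9, 0.9]
frequent_ok = [True, True, False, False]
# Rare class: low confidence, although most predictions are correct.
rare_conf = [0.2, 0.2, 0.2, 0.2]
rare_ok = [True, True, True, False]

frequent_gap = signed_confidence_gap(frequent_conf, frequent_ok)
rare_gap = signed_confidence_gap(rare_conf, rare_ok)
frequent_ece = expected_calibration_error(frequent_conf, frequent_ok)
rare_ece = expected_calibration_error(rare_conf, rare_ok)
```

In this toy setup the frequent class has a positive gap (overconfident) and the rare class a negative one (underconfident). With a fixed confidence threshold for pseudo-labels, a teacher calibrated like this would tend to discard correct rare-class predictions while keeping incorrect frequent-class ones.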