CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning
CANet addresses the high cost and limited availability of pixel-wise labels for segmentation, enabling models to handle new classes without extensive retraining, though it is an incremental improvement in few-shot learning.
The paper tackles few-shot semantic segmentation, where a model must segment new object classes from only a few annotated examples. It proposes CANet, which achieves a mean Intersection-over-Union of 55.4% for 1-shot and 57.1% for 5-shot segmentation on PASCAL VOC 2012, outperforming prior methods by 14.6 and 13.2 percentage points, respectively.
Recent progress in semantic segmentation is driven by deep Convolutional Neural Networks and large-scale labeled image datasets. However, data labeling for pixel-wise segmentation is tedious and costly. Moreover, a trained model can only make predictions within a set of pre-defined classes. In this paper, we present CANet, a class-agnostic segmentation network that performs few-shot segmentation on new classes with only a few annotated images available. Our network consists of a two-branch dense comparison module which performs multi-level feature comparison between the support image and the query image, and an iterative optimization module which iteratively refines the predicted results. Furthermore, we introduce an attention mechanism to effectively fuse information from multiple support examples under the setting of k-shot learning. Experiments on PASCAL VOC 2012 show that our method achieves a mean Intersection-over-Union score of 55.4% for 1-shot segmentation and 57.1% for 5-shot segmentation, outperforming state-of-the-art methods by a large margin of 14.6% and 13.2%, respectively.
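The k-shot attention fusion mentioned above can be illustrated with a minimal sketch. It assumes that each of the k support examples yields one feature vector and one scalar attention score, that the scores are normalized with a softmax, and that the fused result is the weighted sum of the feature vectors. The function names `softmax` and `fuse_support_features`, the flat-vector representation, and the way scores are obtained are hypothetical choices for illustration; the abstract does not specify these details.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_support_features(features, scores):
    """Fuse k support feature vectors using softmax-normalized scores.

    features: list of k equal-length lists of floats (one per support example)
    scores:   list of k raw attention scores (assumed to come from a learned
              attention branch, which is not modeled here)
    Returns (fused_vector, weights).
    """
    if not features or len(features) != len(scores):
        raise ValueError("need one score per support feature vector")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")

    weights = softmax(scores)
    fused = [0.0] * dim
    for weight, feature in zip(weights, features):
        for i, value in enumerate(feature):
            fused[i] += weight * value
    return fused, weights
```

With equal scores the fusion reduces to a plain average of the support features, and with k = 1 it returns the single support feature unchanged, so the same code path covers both the 1-shot and k-shot settings.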