Improved Few-shot Segmentation by Redefinition of the Roles of Multi-level CNN Features
This work addresses the problem of segmenting unseen object classes with limited examples for computer vision applications, representing an incremental improvement over existing methods.
The paper tackles few-shot segmentation by swapping the primary and secondary roles of mid-level and high-level CNN features. This reinterpretation motivates applying the same network iteratively to update the estimate of the object's region, and it achieves new state-of-the-art results on COCO-20$^i$ (1-shot and 5-shot) and PASCAL-5$^i$ (1-shot).
This study is concerned with few-shot segmentation, i.e., segmenting the region of an unseen object class in a query image, given support image(s) of its instances. Current methods rely on pretrained CNN features of the support and query images. The key to good performance is the proper fusion of their mid-level and high-level features; the former contain shape-oriented information, while the latter contain class-oriented information. Current state-of-the-art methods follow the approach of Tian et al., which gives the mid-level features the primary role and the high-level features the secondary role. In this paper, we reinterpret this widely employed approach by redefining the roles of the multi-level features; we swap the primary and secondary roles. Specifically, we regard the current methods as improving an initial estimate, generated from the high-level features, by using the mid-level features. This reinterpretation suggests a new application of the current methods: applying the same network multiple times to iteratively update the estimate of the object's region, starting from its initial estimate. Our experiments show that this method is effective and improves on the previous state of the art on COCO-20$^i$ in the 1-shot and 5-shot settings and on PASCAL-5$^i$ in the 1-shot setting.
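The iterative update described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the cosine-similarity form of the initial estimate, and the `refine` callable (a stand-in for the trained network that consumes mid-level features and the current estimate) are all assumptions made for illustration.

```python
import numpy as np

def initial_estimate(query_high, support_high, support_mask):
    """Hypothetical initial estimate from high-level features.

    Assumption: each query pixel is scored by its maximum cosine similarity
    to the masked support pixels, then min-max normalized to [0, 1].
    query_high, support_high: (C, H, W); support_mask: (H, W) with values in {0, 1}.
    """
    c = query_high.shape[0]
    q = query_high.reshape(c, -1)                      # (C, Nq)
    s = (support_high * support_mask).reshape(c, -1)   # (C, Ns), background zeroed
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-7)
    s = s / (np.linalg.norm(s, axis=0, keepdims=True) + 1e-7)
    sim = q.T @ s                                      # (Nq, Ns) cosine similarities
    prior = sim.max(axis=1)                            # best match per query pixel
    prior = (prior - prior.min()) / (prior.max() - prior.min() + 1e-7)
    return prior.reshape(query_high.shape[1:])

def iterative_segmentation(refine, query_mid, support_mid, support_mask,
                           estimate, num_iters=3):
    """Apply the same refinement function repeatedly to the current estimate.

    `refine` is a hypothetical callable standing in for the network; it takes
    mid-level features, the support mask, and the current estimate, and
    returns an updated estimate of the same shape.
    """
    for _ in range(num_iters):
        estimate = refine(query_mid, support_mid, support_mask, estimate)
    return estimate
```

In this sketch the high-level features are used only once, to produce the starting estimate, while the mid-level features enter every refinement step; the number of iterations `num_iters` is an arbitrary illustrative default, not a value taken from the paper.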