CV · Oct 30, 2021

MFNet: Multi-class Few-shot Segmentation Network with Pixel-wise Metric Learning

arXiv:2111.00232v436 citations
Originality: Incremental advance
AI Analysis

This work addresses the largely unexplored field of multi-class few-shot segmentation for computer vision, providing a novel architecture that could enhance applications requiring segmentation with limited data, but it appears incremental as it builds on existing few-shot segmentation concepts.

The paper tackles few-shot semantic segmentation for multiple classes, where prior work was largely limited to the single-class setting, by proposing MFNet, which combines a multi-class encoding-decoding architecture with pixel-wise metric learning. The method reports clear benefits over the state of the art on the PASCAL-5i and COCO-20i benchmarks, though the abstract gives no specific numerical gains.

In visual recognition tasks, few-shot learning requires the ability to learn object categories from only a few support examples. Its renewed popularity in the deep learning era has centered mainly on image classification. This work focuses on few-shot semantic segmentation, which is still a largely unexplored field. The few recent advances are often restricted to single-class few-shot segmentation. In this paper, we first present a novel multi-way (multi-class) encoding and decoding architecture that fuses multi-scale query information and multi-class support information into one query-support embedding. Multi-class segmentation is decoded directly from this embedding. For better feature fusion, a multi-level attention mechanism is proposed within the architecture, comprising attention for support feature modulation and attention for multi-scale combination. Finally, to enhance embedding space learning, an additional pixel-wise metric learning module is introduced, with a triplet loss formulated on the pixel-level embeddings of the input image. Extensive experiments on the standard benchmarks PASCAL-5i and COCO-20i show clear benefits of our method over the state of the art in few-shot segmentation.
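To make the pixel-wise metric learning idea concrete, the following is a minimal sketch of a triplet loss computed over per-pixel embeddings. It is not the paper's implementation: the abstract does not specify the sampling strategy, margin, or distance function, so the random triplet sampling, the squared Euclidean distance, and the names `pixel_triplet_loss`, `margin`, and `num_triplets` are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pixel_triplet_loss(embeddings, labels, margin=0.5, num_triplets=100, seed=0):
    """Average hinge triplet loss over randomly sampled pixel triplets.

    embeddings: list of per-pixel embedding vectors (image flattened to H*W).
    labels:     list of per-pixel class ids, same length as embeddings.
    The sampling scheme, margin, and distance are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Group pixel indices by class so positives and negatives can be drawn.
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)

    # An anchor class needs at least two pixels (anchor + positive),
    # and at least one other class must exist to supply a negative.
    anchor_classes = [c for c, idxs in by_class.items() if len(idxs) >= 2]
    if not anchor_classes or len(by_class) < 2:
        return 0.0

    total = 0.0
    for _ in range(num_triplets):
        c = rng.choice(anchor_classes)
        a, p = rng.sample(by_class[c], 2)  # anchor and positive: same class
        neg_class = rng.choice([k for k in by_class if k != c])
        n = rng.choice(by_class[neg_class])  # negative: different class

        d_ap = squared_distance(embeddings[a], embeddings[p])
        d_an = squared_distance(embeddings[a], embeddings[n])
        # Hinge: penalize when the positive is not closer than the
        # negative by at least the margin.
        total += max(0.0, d_ap - d_an + margin)
    return total / num_triplets
```

In this sketch, embeddings whose classes are already well separated incur zero loss, while embeddings that collapse to a single point incur a loss equal to the margin, which is the pressure that pushes same-class pixels together and different-class pixels apart.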

