CV | Aug 10, 2023

Category Feature Transformer for Semantic Segmentation

Quan Tang, Chuanjian Liu, Fagui Liu, Yifan Liu, Jun Jiang, Bowen Zhang, Kai Han, Yunhe Wang

arXiv:2308.05581v15.05 citations | h-index: 54 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses multi-stage feature aggregation for semantic segmentation. Its method improves accuracy while reducing parameters and computation, but the advance is incremental because it builds on existing multi-head attention and feature pyramid structures.

The paper tackles the problem of feature aggregation in semantic segmentation by proposing the Category Feature Transformer (CFT), which uses multi-head attention to learn and broadcast category embeddings, achieving a compelling 55.1% mIoU on the ADE20K dataset with reduced parameters and computations.

Aggregation of multi-stage features has been revealed to play a significant role in semantic segmentation. Unlike previous methods employing point-wise summation or concatenation for feature aggregation, this study proposes the Category Feature Transformer (CFT) that explores the flow of category embedding and transformation among multi-stage features through the prevalent multi-head attention mechanism. CFT learns unified feature embeddings for individual semantic categories from high-level features during each aggregation process and dynamically broadcasts them to high-resolution features. Integrating the proposed CFT into a typical feature pyramid structure exhibits superior performance over a broad range of backbone networks. We conduct extensive experiments on popular semantic segmentation benchmarks. Specifically, the proposed CFT obtains a compelling 55.1% mIoU with greatly reduced model parameters and computations on the challenging ADE20K dataset.
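The two steps described in the abstract (learning per-category embeddings from high-level features, then broadcasting them to high-resolution features) can be illustrated with a loose NumPy sketch. This is not the authors' implementation: the function names (`multi_head_attention`, `aggregate`), the use of a separate set of category queries, the absence of learned projections, the residual addition, and all tensor sizes are assumptions made only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(query, key, value, num_heads):
    """Scaled dot-product attention over `num_heads` channel groups.

    query: (Nq, C); key and value: (Nk, C). Returns (Nq, C).
    Simplification: no learned input/output projections.
    """
    nq, c = query.shape
    nk = key.shape[0]
    d = c // num_heads
    q = query.reshape(nq, num_heads, d).transpose(1, 0, 2)  # (H, Nq, d)
    k = key.reshape(nk, num_heads, d).transpose(1, 0, 2)    # (H, Nk, d)
    v = value.reshape(nk, num_heads, d).transpose(1, 0, 2)  # (H, Nk, d)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))   # (H, Nq, Nk)
    out = attn @ v                                          # (H, Nq, d)
    return out.transpose(1, 0, 2).reshape(nq, c)

def aggregate(high_feat, low_feat, category_queries, num_heads=4):
    """One hypothetical aggregation step between two pyramid stages.

    high_feat: (Hh*Wh, C) flattened low-resolution, high-level features.
    low_feat: (Hl*Wl, C) flattened high-resolution features.
    category_queries: (K, C), one query per semantic category (assumed).
    """
    # Step 1: each category query gathers an embedding from high-level features.
    category_emb = multi_head_attention(
        category_queries, high_feat, high_feat, num_heads)
    # Step 2: each high-resolution pixel attends to the K category embeddings.
    broadcast = multi_head_attention(
        low_feat, category_emb, category_emb, num_heads)
    # Residual fusion is an assumption, not taken from the paper.
    return low_feat + broadcast, category_emb

rng = np.random.default_rng(0)
C, K = 32, 5                                  # illustrative sizes only
high = rng.standard_normal((8 * 8, C))        # 8x8 high-level feature map
low = rng.standard_normal((16 * 16, C))       # 16x16 high-resolution map
queries = rng.standard_normal((K, C))
fused, cat_emb = aggregate(high, low, queries)
print(fused.shape, cat_emb.shape)             # (256, 32) (5, 32)
```

In this sketch the second attention step costs on the order of N x K operations per head for N high-resolution pixels and K categories, rather than N x N for pixel-to-pixel attention, which is one plausible reading of why routing information through category embeddings could reduce computation.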
