CV · Feb 2, 2024

Scale Equalization for Multi-Level Feature Fusion

arXiv:2402.01149v1 · 1 citation · h-index: 8 · Trans. Mach. Learn. Res.
Originality: Incremental advance
AI Analysis

The paper addresses a performance degradation issue in semantic segmentation, but its contribution is incremental because it builds on existing multi-level fusion methods.

The paper tackles the problem of scale disequilibrium in multi-level feature fusion for semantic segmentation, caused by bilinear upsampling, and proposes scale equalizers that improve mIoU across datasets like ADE20K, PASCAL VOC 2012, and Cityscapes.

Deep neural networks have exhibited remarkable performance in a variety of computer vision fields, especially in semantic segmentation tasks. Their success is often attributed to multi-level feature fusion, which enables them to understand both global and local information from an image. However, we found that multi-level features from parallel branches are on different scales. The scale disequilibrium is a universal and unwanted flaw that leads to detrimental gradient descent, thereby degrading performance in semantic segmentation. We discover that scale disequilibrium is caused by bilinear upsampling, which is supported by both theoretical and empirical evidence. Based on this observation, we propose injecting scale equalizers to achieve scale equilibrium across multi-level features after bilinear upsampling. Our proposed scale equalizers are easy to implement, applicable to any architecture, hyperparameter-free, implementable without requiring extra computational cost, and guarantee scale equilibrium for any dataset. Experiments showed that adopting scale equalizers consistently improved the mIoU index across various target datasets, including ADE20K, PASCAL VOC 2012, and Cityscapes, as well as various decoder choices, including UPerHead, PSPHead, ASPPHead, SepASPPHead, and FCNHead.
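The abstract's central claim, that interpolation-based upsampling changes the scale of a feature, can be illustrated with a small sketch. This is not the authors' implementation: it uses 1D linear interpolation as a stand-in for bilinear upsampling, i.i.d. Gaussian values as a stand-in for a feature map, and a hypothetical `equalize` helper (zero mean, unit standard deviation) as one plausible way to restore a common scale. The names `linear_upsample` and `equalize` are assumptions made for illustration only.

```python
import random
import statistics


def linear_upsample(values, factor):
    """1D linear interpolation with aligned endpoints (1D analogue of bilinear upsampling)."""
    n = len(values)
    out_len = (n - 1) * factor + 1
    out = []
    for i in range(out_len):
        pos = i / factor              # position in the source grid
        left = int(pos)               # nearest source sample on the left
        right = min(left + 1, n - 1)  # nearest source sample on the right
        w = pos - left                # interpolation weight toward the right sample
        out.append((1.0 - w) * values[left] + w * values[right])
    return out


def equalize(values):
    """Hypothetical equalizer (assumption): rescale to zero mean and unit std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


random.seed(0)
feature = [random.gauss(0.0, 1.0) for _ in range(4096)]  # toy low-resolution feature
upsampled = linear_upsample(feature, 4)                   # toy upsampled branch

var_before = statistics.pvariance(feature)
var_after = statistics.pvariance(upsampled)
var_equalized = statistics.pvariance(equalize(upsampled))

print(f"variance before upsampling: {var_before:.3f}")
print(f"variance after upsampling:  {var_after:.3f}")
print(f"variance after equalizing:  {var_equalized:.3f}")
```

Interpolated samples are weighted averages of neighbouring values, so their variance is lower than that of the original samples. A branch that has been upsampled therefore sits on a smaller scale than a branch that has not, and rescaling each branch after upsampling brings them back to a common scale.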

Code Implementations: 1 repo
