CV · Sep 2, 2020

Multi-domain semantic segmentation with pyramidal fusion

Petra Bevandić, Marin Oršić, Ivan Grubišić, Josip Šarić, Siniša Šegvić

arXiv:2009.01636v55.88 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of applying a single model across multiple domains in semantic segmentation, though it is incremental as it builds on existing methods.

The authors tackled multi-domain semantic segmentation by adapting the SwiftNet architecture with pyramidal fusion and a custom loss, achieving first place in the Robust Vision Challenge semantic segmentation contest and on the WildDash 2 leaderboard.

We present our submission to the semantic segmentation contest of the Robust Vision Challenge held at ECCV 2020. The contest requires submitting the same model to seven benchmarks from three different domains. Our approach is based on the SwiftNet architecture with pyramidal fusion. We address inconsistent taxonomies with a single-level 193-dimensional softmax output. We strive to train with large batches in order to stabilize optimization of a hard recognition problem, and to favour smooth evolution of batchnorm statistics. We achieve this by implementing a custom backward step through log-sum-prob loss, and by using small crops before freezing the population statistics. Our model ranks first on the RVC semantic segmentation challenge as well as on the WildDash 2 leaderboard. This suggests that pyramidal fusion is competitive not only for efficient inference with lightweight backbones, but also in large-scale setups for multi-domain application.
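The abstract names a log-sum-prob loss over a single-level softmax but does not define it. One plausible reading, sketched below, is that each dataset label maps to a set of classes in the shared output space, and the loss is the negative log of the summed softmax probability over that set. The function names, the toy 4-class output (standing in for the 193 classes), and the label-to-class mapping are all assumptions made for illustration, not the authors' implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def log_sum_prob_loss(logits, universal_ids):
    """Negative log of the total softmax probability assigned to the
    universal classes that a dataset label maps to (assumed definition)."""
    logp = log_softmax(logits)
    selected = [logp[i] for i in universal_ids]
    # log-sum-exp over the selected log-probabilities = log of summed probs
    m = max(selected)
    return -(m + math.log(sum(math.exp(v - m) for v in selected)))


# Toy example: 4 universal classes, uniform logits at one pixel.
logits = [0.0, 0.0, 0.0, 0.0]

# Hypothetical mapping: a coarse dataset label covering universal classes 0 and 1.
coarse_loss = log_sum_prob_loss(logits, [0, 1])

# A label that maps to exactly one universal class reduces to
# ordinary cross-entropy for that class.
fine_loss = log_sum_prob_loss(logits, [2])
```

Under this reading, a label from a coarser taxonomy is penalized less than a fine-grained one for the same prediction, because any of its constituent universal classes counts as correct.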
