The Semantic Mutex Watershed for Efficient Bottom-Up Semantic Instance Segmentation
This work addresses the computational bottleneck of joint segmentation and labeling in applications such as urban scene analysis and 3D microscopy, offering an incremental improvement over existing methods.
The authors tackle joint semantic instance segmentation with a greedy algorithm derived from the Mutex Watershed. The algorithm scales efficiently and operates directly on pixels, without superpixels. On the Cityscapes dataset, combined with current deep neural networks, it outperforms Panoptic Feature Pyramid Networks; on a 3D microscopy volume, the joint formulation outperforms separate optimization of partitioning and labeling.
Semantic instance segmentation is the task of simultaneously partitioning an image into distinct segments while associating each pixel with a class label. In commonly used pipelines, segmentation and label assignment are solved separately since joint optimization is computationally expensive. We propose a greedy algorithm for joint graph partitioning and labeling derived from the efficient Mutex Watershed partitioning algorithm. It optimizes an objective function closely related to the Symmetric Multiway Cut objective and empirically shows efficient scaling behavior. Because the algorithm is efficient, it can operate directly on pixels without prior over-segmentation of the image into superpixels. We evaluate its performance on the Cityscapes dataset (2D urban scenes) and on a 3D microscopy volume. In urban scenes, the proposed algorithm combined with current deep neural networks outperforms the strong baseline of "Panoptic Feature Pyramid Networks" by Kirillov et al. (2019). In the 3D electron microscopy images, we show explicitly that our joint formulation outperforms a separate optimization of the partitioning and labeling problems.
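
To make the idea of greedy joint partitioning and labeling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Kruskal-style procedure in the spirit of the Mutex Watershed: all edges are visited in order of decreasing weight, attractive edges merge clusters, repulsive edges add mutual-exclusion constraints, and semantic edges assign a class to a still-unlabeled cluster. The function name `semantic_mutex_watershed`, the edge tuple layout, and the exact merge rules are illustrative assumptions.

```python
def semantic_mutex_watershed(n_nodes, attractive, repulsive, semantic):
    """Greedy joint partitioning and labeling (illustrative sketch).

    attractive, repulsive: lists of (weight, u, v)
    semantic:              list of (weight, u, class_id)
    Returns (cluster root per node, class label per node).
    """
    parent = list(range(n_nodes))               # union-find forest
    mutexes = [set() for _ in range(n_nodes)]   # per root: roots it must not join
    label = [None] * n_nodes                    # per root: assigned class, if any

    def find(x):
        # Find the cluster root with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Assumption: one shared priority order over all three edge types.
    edges = [(w, "attract", u, v) for w, u, v in attractive]
    edges += [(w, "repulse", u, v) for w, u, v in repulsive]
    edges += [(w, "semantic", u, c) for w, u, c in semantic]
    edges.sort(key=lambda e: -e[0])

    for _, kind, u, target in edges:
        ru = find(u)
        if kind == "semantic":
            # Label a cluster only once; earlier (stronger) votes win.
            if label[ru] is None:
                label[ru] = target
            continue
        rv = find(target)
        if ru == rv:
            continue  # already in the same cluster
        if kind == "repulse":
            mutexes[ru].add(rv)
            mutexes[rv].add(ru)
            continue
        # Attractive edge: skip if a mutex or a class conflict forbids merging.
        if rv in mutexes[ru]:
            continue
        if None not in (label[ru], label[rv]) and label[ru] != label[rv]:
            continue
        parent[rv] = ru
        if label[ru] is None:
            label[ru] = label[rv]
        # Re-point the absorbed cluster's mutex constraints at the new root.
        for m in mutexes[rv]:
            mutexes[m].discard(rv)
            mutexes[m].add(ru)
            mutexes[ru].add(m)
        mutexes[rv] = set()

    roots = [find(i) for i in range(n_nodes)]
    return roots, [label[r] for r in roots]
```

In a pixel-level setting, nodes would correspond to pixels, and the edge weights would come from a neural network's affinity and class predictions; how those weights are produced is outside the scope of this sketch.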