EffSeg: Efficient Fine-Grained Instance Segmentation using Structure-Preserving Sparsity
This work addresses efficiency bottlenecks in instance segmentation for computer vision applications, representing an incremental improvement over existing methods.
The paper tackles the problem of inefficient fine-grained instance segmentation by proposing EffSeg with Structure-Preserving Sparsity, achieving similar performance to RefineMask on COCO while reducing FLOPs by 71% and increasing FPS by 29%.
Many two-stage instance segmentation heads predict a coarse 28×28 mask per instance, which is insufficient to capture the fine-grained details of many objects. To address this issue, PointRend and RefineMask predict a 112×112 segmentation mask, resulting in higher-quality segmentations. Both methods, however, have limitations: PointRend does not have access to neighboring features, while RefineMask performs computation at all spatial locations instead of sparsely. In this work, we propose EffSeg, which performs fine-grained instance segmentation efficiently using our Structure-Preserving Sparsity (SPS) method. SPS separately stores the active features, the passive features, and a dense 2D index map containing the feature indices. The index map preserves the 2D spatial configuration, or structure, between the features, so that any 2D operation can still be performed. EffSeg achieves similar performance to RefineMask on COCO, while reducing the number of FLOPs by 71% and increasing the FPS by 29%. Code will be released.
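The storage scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `sps_split` and `sparse_box_filter_3x3`, the convention that active indices precede passive ones, and the extra zero row used for padding are all assumptions made for illustration. A 3×3 box filter stands in for an arbitrary 2D operation.

```python
import numpy as np


def sps_split(dense, active_mask):
    """Split a dense (H, W, C) map into active rows, passive rows and an index map.

    Hypothetical helper: the layout (active rows first) is an assumption.
    """
    active_feats = dense[active_mask]      # (N_active, C), row-major order
    passive_feats = dense[~active_mask]    # (N_passive, C), row-major order
    index_map = np.empty(active_mask.shape, dtype=np.int64)
    # Active locations index rows [0, N_active); passive rows follow after.
    index_map[active_mask] = np.arange(len(active_feats))
    index_map[~active_mask] = len(active_feats) + np.arange(len(passive_feats))
    return active_feats, passive_feats, index_map


def sparse_box_filter_3x3(active_feats, passive_feats, index_map):
    """Apply a zero-padded 3x3 mean filter, computed at active locations only."""
    n_active, channels = active_feats.shape
    # One extra all-zero row stands in for out-of-bounds neighbours.
    table = np.concatenate([active_feats, passive_feats, np.zeros((1, channels))])
    pad_id = len(table) - 1
    padded = np.pad(index_map, 1, constant_values=pad_id)
    # np.nonzero is row-major, so (ys[k], xs[k]) is the location of active row k.
    ys, xs = np.nonzero(index_map < n_active)
    out = np.zeros_like(active_feats)
    for dy in range(3):
        for dx in range(3):
            # The index map tells us which stored row each 2D neighbour is.
            out += table[padded[ys + dy, xs + dx]]
    return out / 9.0
```

In this sketch, only the active rows are recomputed. The passive rows are read as neighbours but never updated, and the index map is left unchanged, so the updated active rows can be fed into a further 2D operation in the same way.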