BLADE: Box-Level Supervised Amodal Segmentation through Directed Expansion
This work addresses the challenge of generating high-resolution, precise amodal masks for occluded objects, a capability essential for applications such as autonomous driving and robotics. The contribution is incremental in that it builds on existing box-level supervised methods.
The paper tackles amodal segmentation of occluded objects by proposing a directed expansion approach from visible masks to amodal masks, using only box-level supervision to avoid costly pixel-level annotations. The results show that the method outperforms existing state-of-the-art methods by large margins on several challenging datasets.
Perceiving the complete shape of occluded objects is essential for human and machine intelligence. Amodal segmentation aims to predict the complete mask of a partially occluded object, but annotating pixel-level ground-truth amodal masks is time-consuming and labor-intensive. Box-level supervised amodal segmentation addresses this challenge by relying solely on ground-truth bounding boxes and instance classes as supervision, thereby alleviating the need for exhaustive pixel-level annotations. Nevertheless, current box-level methods produce low-resolution masks with imprecise boundaries and thus fail to meet the demands of practical real-world applications. We present a novel solution to this problem by introducing a directed expansion approach from visible masks to the corresponding amodal masks. Our approach uses a hybrid end-to-end network built around the overlapping region, i.e., the area where different instances intersect. Different segmentation strategies are applied to overlapping and non-overlapping regions according to their distinct characteristics. To guide the expansion of visible masks, we introduce a carefully designed connectivity loss for overlapping regions, which leverages correlations with visible masks and facilitates accurate amodal segmentation. Experiments on several challenging datasets show that our proposed method outperforms existing state-of-the-art methods by large margins.
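
As a rough illustration of the overlapping region described above, the following Python sketch marks, from box coordinates alone, the pixels of one instance's box that also fall inside another instance's box. It is not the paper's implementation. The box format `(x1, y1, x2, y2)` with exclusive upper bounds and the helper names `intersect` and `overlap_mask` are assumptions made for this example, and the abstract does not state whether the overlap is derived from boxes or from predicted masks.

```python
from typing import List, Optional, Tuple

# Hypothetical box format: (x1, y1, x2, y2), with x2 and y2 exclusive.
Box = Tuple[int, int, int, int]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the intersection rectangle of two boxes, or None if they are disjoint."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


def overlap_mask(boxes: List[Box], index: int, height: int, width: int) -> List[List[int]]:
    """Binary mask of the pixels in boxes[index] that also lie in at least one other box."""
    mask = [[0] * width for _ in range(height)]
    for j, other in enumerate(boxes):
        if j == index:
            continue  # a box trivially overlaps itself; skip it
        inter = intersect(boxes[index], other)
        if inter is None:
            continue
        x1, y1, x2, y2 = inter
        # Clip to the image bounds before filling.
        for y in range(max(y1, 0), min(y2, height)):
            for x in range(max(x1, 0), min(x2, width)):
                mask[y][x] = 1
    return mask


# Two 4x4 boxes that share a 2x2 corner in a 6x6 image.
boxes = [(0, 0, 4, 4), (2, 2, 6, 6)]
mask = overlap_mask(boxes, 0, 6, 6)
overlap_area = sum(sum(row) for row in mask)
```

Pixels inside the box but outside this mask form the non-overlapping region, which the abstract says is handled with a different segmentation strategy.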