Human De-occlusion: Invisible Perception and Recovery for Humans
This work addresses the challenge of perceiving and recovering occluded humans in images, which matters for applications such as surveillance and robotics. The contribution is incremental, however, because it builds on existing instance segmentation and amodal perception methods.
The paper tackles the problem of human de-occlusion by proposing a two-stage framework to estimate occluded segmentation masks and recover invisible appearance content, achieving state-of-the-art performance in both tasks on a new dataset.
In this paper, we tackle the problem of human de-occlusion, which reasons about the occluded segmentation masks and invisible appearance content of humans. In particular, a two-stage framework is proposed to estimate the invisible portions and recover the content inside them. For the mask completion stage, a stacked network structure is devised to refine inaccurate masks from a general instance segmentation model and predict integrated masks simultaneously. Additionally, guidance from human parsing and typical pose masks is leveraged to provide prior information. For the content recovery stage, a novel parsing guided attention module is applied to isolate body parts and capture context information across multiple scales. In addition, an Amodal Human Perception dataset (AHP) is collected to support the task of human de-occlusion. AHP provides annotations from real-world scenes, and its number of humans is comparatively larger than that of other amodal perception datasets. Based on this dataset, experiments demonstrate that our method outperforms state-of-the-art techniques in both mask completion and content recovery. Our AHP dataset is available at \url{https://sydney0zq.github.io/ahp/}.
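To make the relationship between the two kinds of mask concrete, the following minimal sketch shows how an invisible region could be derived from a visible mask and a completed (integrated) mask, which is the region the content recovery stage would need to fill. It is an illustration under stated assumptions, not the paper's implementation: the function names `invisible_region` and `iou`, the toy nested-list masks, and the choice of intersection over union as a comparison measure are all hypothetical and do not come from the abstract.

```python
# Illustrative sketch only: masks are tiny nested lists of 0/1,
# not the network outputs described in the paper.

def invisible_region(visible, integrated):
    """Return pixels that are in the integrated mask but not in the visible mask."""
    return [
        [int(a and not v) for v, a in zip(v_row, a_row)]
        for v_row, a_row in zip(visible, integrated)
    ]


def iou(pred, target):
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    inter = sum(p and t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    union = sum(p or t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    return inter / union if union else 1.0


# Hypothetical example: a person whose right side and legs are occluded.
visible = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
integrated = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]

invisible = invisible_region(visible, integrated)
overlap = iou(visible, integrated)
```

In this toy case, `invisible` marks the four pixels present only in the integrated mask, and `overlap` measures how much of the full human extent the visible mask already covers.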