CV · Apr 7, 2021

V2F-Net: Explicit Decomposition of Occluded Pedestrian Detection

Mingyang Shang, Dawei Xiang, Zhicheng Wang, Erjin Zhou

arXiv:2104.03106v12.613 citations

Originality: Incremental advance

AI Analysis

The work addresses occlusion in pedestrian detection for applications such as autonomous driving, but the advance is incremental because it builds on existing detection frameworks.

The paper tackles occluded pedestrian detection by proposing V2F-Net, which decomposes the task into visible region detection and full-body estimation. It reports a 5.85% AP gain on CrowdHuman and a 2.24% MR-2 improvement on CityPersons over an FPN baseline.

Occlusion is very challenging in pedestrian detection. In this paper, we propose a simple yet effective method named V2F-Net, which explicitly decomposes occluded pedestrian detection into visible region detection and full-body estimation. V2F-Net consists of two sub-networks: the Visible region Detection Network (VDN) and the Full body Estimation Network (FEN). VDN localizes visible regions, and FEN estimates the full-body box on the basis of the visible box. Moreover, to further improve full-body estimation, we propose a novel Embedding-based Part-aware Module (EPM). By supervising the visibility of each part, the network is encouraged to extract features with essential part information. We show the effectiveness of V2F-Net through experiments on two challenging datasets. V2F-Net achieves a 5.85% AP gain on CrowdHuman and a 2.24% MR-2 improvement on CityPersons compared to the FPN baseline. In addition, the consistent gains on both one-stage and two-stage detectors validate the generalizability of our method.
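To make the visible-to-full decomposition more concrete, here is a minimal sketch of the two ideas the abstract describes: estimating a full-body box from a visible box, and deriving per-part visibility labels for supervision. The abstract does not give the actual parameterization, so everything below is an assumption for illustration: the function names `decode_full_box` and `part_visibility_labels` are hypothetical, the offset encoding is the common R-CNN-style `(dx, dy, dw, dh)` form, and the split into five horizontal strips with a 0.5 overlap threshold is an arbitrary choice, not the paper's EPM design.

```python
import math


def decode_full_box(visible_box, deltas):
    """Apply R-CNN-style offsets (dx, dy, dw, dh) to a visible box.

    Boxes are (x1, y1, x2, y2). This encoding is an assumption; the
    abstract does not state how FEN parameterizes the full-body box.
    """
    x1, y1, x2, y2 = visible_box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas
    # Shift the centre by a fraction of the visible size, then rescale.
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


def part_visibility_labels(full_box, visible_box, num_parts=5, min_overlap=0.5):
    """Label each horizontal strip of the full box as visible (1) or not (0).

    A strip counts as visible when at least `min_overlap` of its area lies
    inside the visible box. Strip layout and threshold are hypothetical.
    """
    fx1, fy1, fx2, fy2 = full_box
    vx1, vy1, vx2, vy2 = visible_box
    strip_h = (fy2 - fy1) / num_parts
    strip_area = (fx2 - fx1) * strip_h
    # Horizontal overlap is the same for every strip.
    inter_w = max(0.0, min(fx2, vx2) - max(fx1, vx1))
    labels = []
    for i in range(num_parts):
        sy1 = fy1 + i * strip_h
        sy2 = sy1 + strip_h
        inter_h = max(0.0, min(sy2, vy2) - max(sy1, vy1))
        ratio = inter_w * inter_h / strip_area if strip_area > 0 else 0.0
        labels.append(1 if ratio >= min_overlap else 0)
    return labels
```

For example, with a full box of `(0, 0, 10, 50)` and a visible box covering only its upper half, `(0, 0, 10, 25)`, `part_visibility_labels` marks the top three strips as visible and the bottom two as occluded. In a trained model the offsets passed to `decode_full_box` would come from a learned regression head rather than being supplied by hand.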
