PMC-GANs: Generating Multi-Scale High-Quality Pedestrian with Multimodal Cascaded GANs
This work addresses data augmentation needs for pedestrian detection, offering a domain-specific incremental improvement.
The paper tackles the problem of generating realistic and diversified pedestrian images for data augmentation. It proposes PMC-GANs, a multimodal cascaded GAN model whose generator is a residual U-net with multi-scale and attention residual blocks, and reports improved pedestrian detection performance when the generated images are used for augmentation.
Recently, generative adversarial networks (GANs) have shown great advantages in synthesizing images, prompting a surge of work on using synthesized images to augment data. This paper proposes multimodal cascaded generative adversarial networks (PMC-GANs) to generate realistic and diversified pedestrian images and augment pedestrian detection data. The generator of our model uses a residual U-net structure, with multi-scale residual blocks to encode features and attention residual blocks to help decode and rebuild pedestrian images. The model is built in a coarse-to-fine fashion and adopts a cascade structure, which helps produce high-resolution pedestrians. PMC-GANs outperform the baselines and, when used for data augmentation, improve pedestrian detection results.
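
The coarse-to-fine cascade described above can be pictured as a chain of stages, each refining an upsampled copy of the previous stage's output. The following is only a minimal sketch of that data flow, not the paper's implementation: the abstract does not state the number of stages, the scale factor, or the resolutions, so the 2x factor, the two stages, the 8x4 starting size, and the names `upsample2x`, `run_cascade`, and `identity_refiner` are all assumptions made for illustration. The learned generator at each stage is replaced by a placeholder that returns its input unchanged.

```python
from typing import Callable, List

# A grayscale image stored as rows of pixel values (illustrative type only).
Image = List[List[float]]


def upsample2x(img: Image) -> Image:
    """Nearest-neighbour 2x upsampling: each pixel becomes a 2x2 block."""
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(2)]  # double the width
        out.append(wide)
        out.append(list(wide))  # double the height
    return out


def run_cascade(coarse: Image,
                refiners: List[Callable[[Image], Image]]) -> List[Image]:
    """Run a coarse-to-fine cascade and return every stage's output.

    Each refiner receives the previous output upsampled by 2x. The 2x
    factor is an assumption; the abstract does not specify it.
    """
    outputs = [coarse]
    current = coarse
    for refine in refiners:
        current = refine(upsample2x(current))
        outputs.append(current)
    return outputs


def identity_refiner(img: Image) -> Image:
    # Hypothetical stand-in for a learned generator stage: copies its input.
    return [list(row) for row in img]


# Assumed starting size: 8 rows by 4 columns (a tall pedestrian crop).
coarse = [[0.0] * 4 for _ in range(8)]
stages = run_cascade(coarse, [identity_refiner, identity_refiner])
sizes = [(len(s), len(s[0])) for s in stages]
print(sizes)
```

In a real model each placeholder would be a trained generator that adds detail at its resolution, so the final stage yields the high-resolution pedestrian while earlier stages supply the coarse layout.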