ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding
This work addresses crowd and vehicle counting in noisy, congested scenes. It proposes a method that improves accuracy over existing approaches, though the contribution is incremental.
The paper tackles accuracy degradation in crowd counting for highly congested, noisy scenes. It proposes ADCrowdNet, which combines an attention-aware network with multi-scale deformable convolutions to generate high-quality density maps, and it reports state-of-the-art results on five datasets spanning crowd and vehicle counting benchmarks.
We propose an attention-injective deformable convolutional network called ADCrowdNet for crowd understanding that addresses the accuracy degradation problem in highly congested, noisy scenes. ADCrowdNet contains two concatenated networks. An attention-aware network called Attention Map Generator (AMG) first detects crowd regions in images and computes the congestion degree of these regions. Based on the detected crowd regions and congestion priors, a multi-scale deformable network called Density Map Estimator (DME) then generates high-quality density maps. With the attention-aware training scheme and the multi-scale deformable convolutional scheme, ADCrowdNet captures crowd features more effectively and is more resistant to various kinds of noise. We have evaluated our method on four popular crowd counting datasets (ShanghaiTech, UCF_CC_50, WorldEXPO'10, and UCSD) and an additional vehicle counting dataset, TRANCOS, and our approach outperforms existing state-of-the-art approaches on all of these datasets.
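The two-stage data flow described above (AMG first, then DME) can be sketched as a toy pipeline. This is only an illustration of the data flow, not the authors' implementation: the function names, the brightness threshold standing in for a learned attention network, the element-wise product used to inject attention, and the pass-through density estimator are all assumptions made for this sketch. The one standard fact it relies on is that the predicted count is the sum of the density map.

```python
from typing import List

Grid = List[List[float]]


def attention_map_generator(image: Grid, threshold: float = 0.5) -> Grid:
    """Hypothetical stand-in for AMG: flag pixels above a threshold as crowd.

    A real AMG would be a trained network; the threshold is an assumption.
    """
    return [[1.0 if px > threshold else 0.0 for px in row] for row in image]


def inject_attention(image: Grid, attention: Grid) -> Grid:
    """Assumed injection step: element-wise product of input and attention."""
    return [
        [px * a for px, a in zip(img_row, att_row)]
        for img_row, att_row in zip(image, attention)
    ]


def density_map_estimator(masked: Grid, scale: float = 1.0) -> Grid:
    """Hypothetical stand-in for DME: rescale the attended input.

    A real DME would apply multi-scale deformable convolutions here.
    """
    return [[scale * px for px in row] for row in masked]


def estimate_count(image: Grid) -> float:
    """Run AMG, inject attention, run DME, then sum the density map."""
    attention = attention_map_generator(image)
    masked = inject_attention(image, attention)
    density = density_map_estimator(masked)
    return sum(sum(row) for row in density)


if __name__ == "__main__":
    toy_image = [[0.25, 0.75], [1.0, 0.0]]
    print(estimate_count(toy_image))  # prints 1.75
```

In this toy run, the pixel with value 0.25 falls below the threshold and is suppressed by the attention mask before density estimation. That is the role the abstract assigns to the detected crowd regions.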