CV | AI | Nov 7, 2020

Coarse- and Fine-grained Attention Network with Background-aware Loss for Crowd Density Map Estimation

arXiv:2011.03721v1 | 65 citations
Originality: Incremental advance
AI Analysis

This work addresses crowd analysis for surveillance and public safety, offering incremental improvements in counting accuracy and density map quality.

The paper tackles crowd density map estimation and people counting by proposing CFANet, which combines a coarse-to-fine attention mechanism with a background-aware loss. It reports higher count accuracy than previous state-of-the-art methods, better density map quality, and a lower false recognition ratio.

In this paper, we present a novel method, the Coarse- and Fine-grained Attention Network (CFANet), for generating high-quality crowd density maps and estimating people counts by incorporating attention maps to better focus on the crowd area. Because generating accurate fine-grained attention maps directly is normally difficult, we devise a coarse-to-fine progressive attention mechanism that integrates a Crowd Region Recognizer (CRR) branch and a Density Level Estimator (DLE) branch, which suppresses the influence of irrelevant background and assigns attention weights according to crowd density levels. We also employ a multi-level supervision mechanism to assist gradient backpropagation and reduce overfitting. In addition, we propose a Background-aware Structural Loss (BSL) to reduce the false recognition ratio while improving structural similarity to the ground truth. Extensive experiments on commonly used datasets show that our method not only outperforms previous state-of-the-art methods in count accuracy but also improves the image quality of the density maps and reduces the false recognition ratio.
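The coarse-to-fine idea in the abstract can be illustrated with a toy sketch. It is not CFANet's architecture or its BSL formula. The function names, the per-level weights, and the "background leak ratio" are hypothetical stand-ins chosen for illustration. The only standard fact used is that a people count is obtained by summing a density map.

```python
# Illustrative sketch only. All names, weights, and formulas here are
# assumptions for exposition, not the paper's published method.

def apply_attention(features, coarse_mask, level_map, level_weights):
    """Gate a 2-D map with a coarse crowd/background mask (0 or 1),
    then scale each pixel by a weight chosen from its density level."""
    gated = []
    for f_row, m_row, l_row in zip(features, coarse_mask, level_map):
        gated.append([f * m * level_weights[l]
                      for f, m, l in zip(f_row, m_row, l_row)])
    return gated

def count_from_density(density):
    """People count is the sum (discrete integral) of the density map."""
    return sum(sum(row) for row in density)

def background_leak_ratio(pred_density, gt_mask):
    """Hypothetical proxy for false recognition: the fraction of predicted
    mass that falls on ground-truth background pixels (mask value 0)."""
    total = count_from_density(pred_density)
    if total == 0:
        return 0.0
    leaked = sum(p
                 for p_row, m_row in zip(pred_density, gt_mask)
                 for p, m in zip(p_row, m_row)
                 if m == 0)
    return leaked / total

# Toy 2x2 example with made-up values.
features = [[0.5, 0.5],
            [1.0, 2.0]]
coarse_mask = [[0, 1],          # coarse stage: top-left pixel is background
               [1, 1]]
level_map = [[0, 1],            # fine stage: density level index per pixel
             [1, 2]]
level_weights = [0.0, 0.5, 1.0]  # assumed weight for each density level

gated = apply_attention(features, coarse_mask, level_map, level_weights)
raw_leak = background_leak_ratio(features, coarse_mask)
gated_leak = background_leak_ratio(gated, coarse_mask)
estimated_count = count_from_density(gated)
```

In this toy case the coarse mask removes all mass from the background pixel, so the leak ratio drops to zero after gating, which mirrors the abstract's stated goal of suppressing irrelevant background before assigning density-dependent weights.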

Code Implementations: 1 repo
