CV · Jun 21, 2022

Counting Varying Density Crowds Through Density Guided Adaptive Selection CNN and Transformer Estimation

arXiv:2206.10075v2 · 46 citations · h-index: 58
Originality: Incremental advance
AI Analysis

This paper addresses crowd counting in real-world scenes where density varies within an image. It is an incremental improvement that combines existing CNN and Transformer methods.

The paper tackles the problem of counting crowds with varying densities by proposing CTASNet, which adaptively selects CNN or Transformer branches based on density regions, achieving state-of-the-art results on four datasets.

In real-world crowd counting applications, crowd density varies greatly within an image. Faced with such variation, humans tend to locate and count individual targets in low-density regions and to estimate the number in high-density regions. We observe that a CNN focuses on local information correlation using fixed-size convolution kernels, while a Transformer can effectively extract semantic crowd information through its global self-attention mechanism. Thus, a CNN can locate and estimate crowds accurately in low-density regions but struggles to perceive density properly in high-density regions. Conversely, a Transformer is highly reliable in high-density regions but fails to locate targets in sparse regions. Neither handles this kind of density variation well on its own. To address this problem, we propose a CNN and Transformer Adaptive Selection Network (CTASNet), which adaptively selects the appropriate counting branch for different density regions. First, CTASNet generates predictions from both the CNN and the Transformer. Then, since the CNN suits low-density regions and the Transformer suits high-density regions, a density-guided adaptive selection module automatically combines the two predictions. Moreover, to reduce the influence of annotation noise, we introduce a correntropy-based optimal transport loss. Extensive experiments on four challenging crowd counting datasets validate the proposed method.
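
The selection idea in the abstract can be illustrated with a toy sketch. This is not CTASNet's module, which the abstract describes only as automatically combining the two predictions. Here a hard threshold on a per-patch count stands in for it, and the function name `select_by_density` and the parameters `threshold` and `patch` are hypothetical choices made for illustration.

```python
import numpy as np

def select_by_density(cnn_map, trans_map, threshold=2.0, patch=2):
    """Toy density-guided selection between two predicted density maps.

    Assumption: a fixed threshold on the per-patch count decides which
    branch to trust. The paper's actual selection module is not
    reproduced here.
    """
    cnn_map = np.asarray(cnn_map, dtype=float)
    trans_map = np.asarray(trans_map, dtype=float)
    if cnn_map.shape != trans_map.shape:
        raise ValueError("density maps must have the same shape")
    h, w = cnn_map.shape
    if h % patch or w % patch:
        raise ValueError("map size must be divisible by patch size")

    # Use the mean of both predictions as a rough density guide.
    guide = 0.5 * (cnn_map + trans_map)
    # Sum the guide inside each patch to get a per-region count.
    region = guide.reshape(h // patch, patch, w // patch, patch).sum(axis=(1, 3))
    # Regions above the threshold are treated as high-density.
    dense = region > threshold
    # Expand the per-region decision back to pixel resolution.
    mask = np.repeat(np.repeat(dense, patch, axis=0), patch, axis=1)
    # High-density pixels take the Transformer output, the rest take the CNN output.
    combined = np.where(mask, trans_map, cnn_map)
    return combined, mask
```

Summing `combined` gives the estimated count. A soft, learned weighting in place of the hard mask would likely be closer in spirit to an "adaptive" module, but that design detail is not stated in the abstract.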

