CV · LG · Nov 25, 2019

Estimating People Flows to Better Count Them in Crowded Scenes

arXiv:1911.10782v4 · 41 citations
Originality: Incremental advance
AI Analysis

This paper addresses accurate crowd counting in video for surveillance and public-safety applications. It is a significant but incremental improvement over existing methods.

Rather than regressing people densities directly, the paper estimates people flows between consecutive frames and infers the densities from those flows. This allows stronger temporal constraints and lets the model exploit the correlation between people flow and optical flow. The authors report consistently outperforming state-of-the-art methods on five benchmark datasets.

Modern methods for counting people in crowded scenes rely on deep networks to estimate people densities in individual images. As such, only very few take advantage of temporal consistency in video sequences, and those that do only impose weak smoothness constraints across consecutive frames. In this paper, we advocate estimating people flows across image locations between consecutive images and inferring the people densities from these flows instead of directly regressing them. This enables us to impose much stronger constraints encoding the conservation of the number of people. As a result, it significantly boosts performance without requiring a more complex architecture. Furthermore, it enables us to exploit the correlation between people flow and optical flow to further improve the results. We demonstrate that we consistently outperform state-of-the-art methods on five benchmark datasets.
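
The idea of inferring densities from flows can be illustrated with a small sketch. This is not the authors' implementation. The grid layout, the 3x3 neighbourhood, and the names `OFFSETS` and `densities_from_flows` are assumptions made for illustration only. Each cell's density at the earlier frame is taken as the sum of its outgoing flows, and each cell's density at the later frame as the sum of the flows arriving in it, so the total count is conserved whenever nobody leaves the grid.

```python
import numpy as np

# Assumed 3x3 neighbourhood of (dy, dx) moves; index 4 is "stay in place".
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def densities_from_flows(flows):
    """Derive two density maps from one flow tensor (illustrative only).

    flows[k, y, x] is the number of people moving from cell (y, x) in the
    earlier frame to cell (y + dy, x + dx) in the later frame, where
    (dy, dx) = OFFSETS[k]. Returns (earlier density, later density).
    """
    k, h, w = flows.shape
    if k != len(OFFSETS):
        raise ValueError("expected one flow channel per neighbourhood offset")

    # Earlier frame: everyone in a cell leaves along exactly one channel.
    prev_density = flows.sum(axis=0)

    # Later frame: accumulate every flow at its destination cell.
    next_density = np.zeros((h, w), dtype=flows.dtype)
    for idx, (dy, dx) in enumerate(OFFSETS):
        for y in range(h):
            for x in range(w):
                ty, tx = y + dy, x + dx
                # Flows that point outside the grid are people leaving the scene.
                if 0 <= ty < h and 0 <= tx < w:
                    next_density[ty, tx] += flows[idx, y, x]
    return prev_density, next_density


# Toy example: one person moves right from the centre, two stay at the corner.
flows = np.zeros((9, 3, 3))
flows[5, 1, 1] = 1.0  # OFFSETS[5] == (0, 1): move one cell to the right
flows[4, 0, 0] = 2.0  # OFFSETS[4] == (0, 0): stay in place
prev_density, next_density = densities_from_flows(flows)
print(prev_density.sum(), next_density.sum())
```

Because both density maps are computed from the same flow values, the two totals agree by construction in this toy case; a directly regressed pair of density maps carries no such guarantee.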

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
