CV · May 25, 2020

Interlayer and Intralayer Scale Aggregation for Scale-invariant Crowd Counting

arXiv:2005.11943v1 · 23 citations
Originality: Incremental advance
AI Analysis

This paper addresses crowd counting for computer vision applications, offering an incremental improvement over multi-column approaches.

The paper tackles the problem of scale variation and density shifts in crowd counting by proposing a single-column network (ScSiNet) with interlayer and intralayer scale aggregation, which outperforms state-of-the-art methods in accuracy and transferability.

Crowd counting is an important vision task that faces challenges from continuous scale variation within a given scene and large density shifts both within and across images. Existing methods typically address these challenges with multi-column structures. However, such an approach does not provide consistent improvement or transferability because of its limited ability to capture multi-scale features, its sensitivity to large density shifts, and the difficulty of training multi-branch models. To overcome these limitations, this paper presents a Single-column Scale-invariant Network (ScSiNet), which extracts sophisticated scale-invariant features by combining interlayer multi-scale integration with a novel intralayer scale-invariant transformation (SiT). Furthermore, to enlarge the diversity of densities, a randomly integrated loss is presented for training the single-branch method. Extensive experiments on public datasets demonstrate that the proposed method consistently outperforms state-of-the-art approaches in counting accuracy and achieves remarkable transferability and scale invariance.

