CV | Mar 15, 2025

EHNet: An Efficient Hybrid Network for Crowd Counting and Localization

arXiv:2503.12061v1 | 2 citations | h-index: 1 | Has Code | Poster Volume Ⅰ, The 2025 Twenty-First International Conference on Intelligent Computing, July 26-29, 2025, Ningbo, China

Originality: Incremental advance

AI Analysis

This work addresses crowd counting and localization for computer vision applications, presenting an incremental improvement over existing methods.

The authors tackled the challenge of multi-scale crowd distributions in crowd counting and localization by introducing EHNet, which achieved competitive performance on four benchmark datasets with reduced computational overhead.

In recent years, crowd counting and localization have become crucial techniques in computer vision, with applications spanning various domains. The presence of multi-scale crowd distributions within a single image remains a fundamental challenge in crowd counting tasks. To address this challenge, we introduce the Efficient Hybrid Network (EHNet), a novel framework for efficient crowd counting and localization. By reformulating crowd counting as a point regression problem, EHNet leverages the Spatial-Position Attention Module (SPAM) to capture comprehensive spatial contexts and long-range dependencies. Additionally, we develop an Adaptive Feature Aggregation Module (AFAM) to effectively fuse and harmonize multi-scale feature representations. Building upon these modules, we introduce the Multi-Scale Attentive Decoder (MSAD). Experimental results on four benchmark datasets (ShanghaiTech Part_A, ShanghaiTech Part_B, UCF-CC-50, and UCF-QNRF) demonstrate that EHNet achieves competitive performance with reduced computational overhead. Our code is available at https://anonymous.4open.science/r/EHNet.
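
As a rough illustration of what a point regression formulation implies at inference time, the sketch below turns a set of predicted head points with confidence scores into both a count and a list of locations. This is not EHNet's implementation, which the abstract does not detail; the function names, the 0.5 threshold, and the sample data are all assumptions made for illustration. Mean absolute error is included because it is a standard metric for comparing predicted and ground-truth counts.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates of a predicted head


def count_from_points(
    points: List[Point], scores: List[float], threshold: float = 0.5
) -> Tuple[int, List[Point]]:
    """Keep point proposals whose confidence exceeds the threshold.

    The kept points give the localization; their number gives the count.
    The threshold value is a hypothetical default, not taken from the paper.
    """
    if len(points) != len(scores):
        raise ValueError("points and scores must have the same length")
    kept = [p for p, s in zip(points, scores) if s > threshold]
    return len(kept), kept


def mean_absolute_error(predicted: List[int], actual: List[int]) -> float:
    """Average absolute difference between predicted and true counts."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical proposals for one image: three candidates, two confident.
    proposals = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
    confidences = [0.9, 0.2, 0.7]
    count, locations = count_from_points(proposals, confidences)
    print(count, locations)
```

In a formulation like this, counting and localization come from the same output, so no separate density-map integration step is needed.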
