CV | AI | Oct 15, 2025

Real-Time Crowd Counting for Embedded Systems with Lightweight Architecture

arXiv:2510.13250v1 | 1 citation | h-index: 12 | Vicinagearth
Originality: Incremental advance
AI Analysis

This paper addresses the need for efficient crowd counting on resource-constrained embedded devices used in intelligent security and public safety. It represents an incremental improvement in speed optimization.

The paper tackles real-time crowd counting on embedded systems by designing a lightweight architecture with a stem-encoder-decoder structure. It reaches 381.7 FPS on a GTX 1080Ti and 71.9 FPS on a Jetson TX1, the fastest inference among the compared methods, while maintaining competitive accuracy.
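To put the reported throughput in per-frame terms, the sketch below converts FPS to mean latency and shows one simple way to time an inference callable. The helper names `fps_to_latency_ms` and `measure_fps` are hypothetical and are not taken from the paper; the paper's own benchmarking procedure is not described here, so this is only an illustration of the arithmetic.

```python
import time


def fps_to_latency_ms(fps: float) -> float:
    """Convert throughput in frames per second to mean per-frame latency in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def measure_fps(infer, n_warmup: int = 10, n_runs: int = 100) -> float:
    """Time a zero-argument callable and return the observed frames per second.

    Hypothetical helper: `infer` stands in for one forward pass of a model.
    """
    for _ in range(n_warmup):  # warm-up runs are excluded from the timing
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


# Throughput figures as reported in the summary above.
reported = {"GTX 1080Ti": 381.7, "Jetson TX1": 71.9}
for device, fps in reported.items():
    print(f"{device}: {fps} FPS = {fps_to_latency_ms(fps):.2f} ms per frame")
```

Under this conversion, 381.7 FPS corresponds to roughly 2.62 ms per frame and 71.9 FPS to roughly 13.91 ms per frame. When timing on a GPU, the callable would also need to block until the device has finished its work; otherwise the loop measures only how quickly work is queued.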

Crowd counting is the task of estimating the number of people in a crowd from images, and it is highly valuable in fields such as intelligent security, urban planning, and public safety management. However, existing counting methods are hard to apply in practice on embedded systems in these fields because of problems such as excessive model parameters and large amounts of complex computation. Practical deployment on embedded systems requires the model to run in real time, that is, to be fast enough. To address these problems, we design a super real-time model with a stem-encoder-decoder structure for crowd counting, which achieves the fastest inference among state-of-the-art methods. First, large convolution kernels in the stem network enlarge the receptive field, which effectively extracts detailed head information. Then, in the encoder, we use conditional channel weighting and a multi-branch local fusion block to merge multi-scale features at low computational cost. This part is crucial to the super real-time performance of the model. Finally, a feature pyramid network is added on top of the encoder to alleviate its incomplete fusion problem. Experiments on three benchmarks show that our network is suitable for super real-time crowd counting on embedded systems while maintaining competitive accuracy. At the same time, the inference speed of the proposed network is the fastest: it achieves 381.7 FPS on an NVIDIA GTX 1080Ti and 71.9 FPS on an NVIDIA Jetson TX1.
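The abstract's point about large stem kernels enlarging the receptive field can be illustrated with the standard receptive-field recurrence for stacked convolutions. The kernel sizes and strides below are hypothetical values chosen for illustration; the abstract does not state the actual stem configuration, and dilation is ignored.

```python
def receptive_field(layers):
    """Return the receptive field, in input pixels, of a stack of conv layers.

    `layers` is a list of (kernel_size, stride) pairs applied in order.
    Recurrence: rf += (k - 1) * jump, then jump *= stride, where `jump`
    is the spacing in input pixels between adjacent output positions.
    """
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Two hypothetical stems, both downsampling the input by 4x overall.
small_kernel_stem = [(3, 2), (3, 2)]
large_kernel_stem = [(7, 2), (7, 2)]

print(receptive_field(small_kernel_stem))  # 7
print(receptive_field(large_kernel_stem))  # 19
```

With the same number of layers and the same downsampling factor, the larger kernels in this toy configuration cover a 19-pixel window instead of a 7-pixel one, which is the kind of effect the abstract attributes to its stem design.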

