CV · Jan 11, 2024

A Lightweight Feature Fusion Architecture For Resource-Constrained Crowd Counting

Yashwardhan Chaudhuri, Ankit Kumar, Orchid Chetia Phukan, Arun Balaji Buduru

arXiv:2401.05968v12.04 citations · h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses efficiency challenges for deploying crowd-counting models in real-world applications, though it is incremental as it builds on existing backbones and fusion techniques.

The paper tackles the problem of deploying crowd-counting models on resource-constrained devices by introducing two lightweight models built on MobileNet and MobileViT backbones. The authors report results comparable to state-of-the-art methods on datasets such as ShanghaiTech-A, and describe their models as the most computationally efficient among those compared.

Crowd counting finds direct applications in real-world situations, making computational efficiency and performance crucial. However, most previous methods rely on a heavy backbone and a complex downstream architecture, which restricts deployment. To address this challenge and enhance the versatility of crowd-counting models, we introduce two lightweight models. These models share the same downstream architecture while incorporating two distinct backbones: MobileNet and MobileViT. We leverage Adjacent Feature Fusion to extract features at diverse scales from a Pre-Trained Model (PTM) and subsequently combine these features. This approach lets our models achieve improved performance while maintaining a compact and efficient design. Compared with previously available state-of-the-art (SOTA) methods on the ShanghaiTech-A, ShanghaiTech-B, and UCF-CC-50 datasets, our models achieve comparable results while being the most computationally efficient. Finally, we present a comparative study and an extensive ablation study, along with pruning, to show the effectiveness of our models.
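The abstract does not spell out how Adjacent Feature Fusion combines scales, so the following is only a toy sketch of the general idea: take feature maps at neighbouring resolutions and merge each coarser map into the next finer one. Everything here is an assumption made for illustration, including the single-channel list-of-lists feature maps, nearest-neighbour upsampling, element-wise addition as the fusion operator, and the function names `upsample2x`, `fuse_adjacent`, and `adjacent_feature_fusion`. It is not the authors' implementation.

```python
from typing import List

# Toy single-channel feature map (rows x cols); real backbones emit
# multi-channel tensors, which this sketch deliberately ignores.
FeatureMap = List[List[float]]


def upsample2x(fmap: FeatureMap) -> FeatureMap:
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out: FeatureMap = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate the row
    return out


def fuse_adjacent(fine: FeatureMap, coarse: FeatureMap) -> FeatureMap:
    """Fuse two neighbouring scales by upsampling the coarser map and adding
    it element-wise to the finer one (addition is an assumed operator)."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("adjacent scales must differ by exactly 2x")
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(fine, up)]


def adjacent_feature_fusion(pyramid: List[FeatureMap]) -> FeatureMap:
    """Fold a fine-to-coarse pyramid into one map, fusing neighbours in turn."""
    fused = pyramid[-1]                   # start from the coarsest scale
    for fine in reversed(pyramid[:-1]):   # walk toward the finest scale
        fused = fuse_adjacent(fine, fused)
    return fused


# Hypothetical backbone outputs at three scales (4x4, 2x2, 1x1).
pyramid = [
    [[1.0] * 4 for _ in range(4)],
    [[2.0] * 2 for _ in range(2)],
    [[3.0]],
]
fused = adjacent_feature_fusion(pyramid)
```

In this toy setup the fused map keeps the resolution of the finest scale while carrying information from every coarser one. A real model would more plausibly use learned convolutions and channel concatenation on backbone tensors in place of plain addition.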
