CV · Mar 17, 2021

Multi-channel Deep Supervision for Crowd Counting

arXiv:2103.09553v1 · 1.4

Originality: Incremental advance

AI Analysis

This work addresses accuracy issues in crowd counting for applications like public safety, but it is incremental as it builds on existing CNN-based methods.

The paper tackles overfitting and detail loss in crowd counting by proposing MDSNet, which uses Multi-channel Deep Supervision and an auxiliary SupervisionNet to generate supervision maps, achieving competitive results on mainstream benchmarks.

Crowd counting is a task worth exploring in modern society because of its wide applications, such as public safety and video monitoring. Many CNN-based approaches have been proposed to improve estimation accuracy, but some inherent issues still limit performance, such as overfitting and the loss of detail caused by pooling layers. To tackle these problems, we propose an effective network called MDSNet, which introduces a novel supervision framework called Multi-channel Deep Supervision (MDS). MDS conducts channel-wise supervision on the decoder of the estimation model to help generate the density maps. To obtain accurate supervision information for the different channels, MDSNet employs an auxiliary network called SupervisionNet (SN) to generate abundant supervision maps from the existing ground truth. Besides the traditional density map supervision, we also use the SN to convert the dot annotations into continuous supervision information and conduct dot supervision in MDSNet. Extensive experiments on several mainstream benchmarks show that the proposed MDSNet achieves competitive results and that MDS significantly improves performance without changing the network structure.
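For readers unfamiliar with the "traditional density map supervision" mentioned in the abstract, the sketch below shows one common way to turn dot annotations into a density map: place a unit-mass Gaussian at each annotated head position, so the map sums to the crowd count. This is not the paper's SupervisionNet, which is a learned network; the function names `gaussian_kernel` and `dots_to_density`, the fixed `sigma`, and the border renormalization are all illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a 2-D Gaussian kernel of odd side length `size` that sums to 1."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def dots_to_density(points, height: int, width: int, sigma: float = 4.0) -> np.ndarray:
    """Build a density map from integer (row, col) dot annotations.

    Hypothetical helper: each in-bounds point contributes exactly one unit
    of mass, so the map's sum equals the number of annotated heads.
    """
    size = int(6 * sigma) | 1          # odd side length, roughly +/- 3 sigma
    r = size // 2
    kernel = gaussian_kernel(size, sigma)
    density = np.zeros((height, width), dtype=np.float64)

    for row, col in points:
        # Clip the kernel window to the image bounds.
        top, bottom = max(row - r, 0), min(row + r + 1, height)
        left, right = max(col - r, 0), min(col + r + 1, width)
        k_top = top - (row - r)
        k_left = left - (col - r)
        patch = kernel[k_top:k_top + (bottom - top), k_left:k_left + (right - left)]
        # Renormalize the clipped patch so border points still add one unit.
        density[top:bottom, left:right] += patch / patch.sum()

    return density


# Three annotated heads, one of them in the image corner.
density = dots_to_density([(10, 10), (0, 0), (30, 50)], height=64, width=64)
estimated_count = density.sum()
```

Summing the map recovers the count, which is why a regression loss on density maps can be used to train a counting network. A fixed Gaussian is only one choice; the abstract's point is that SN learns richer supervision maps from the same ground truth.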
