CV Apr 30, 2018

CrowdHuman: A Benchmark for Detecting Human in a Crowd

Shuai Shao, Zijian Zhao, Boxun Li, Tete Xiao, Gang Yu, Xiangyu Zhang, Jian Sun

arXiv:1805.00123v1 35.7849 citations

Originality: Synthesis-oriented

AI Analysis

CrowdHuman provides a benchmark for researchers working on human detection in crowded scenes, though the contribution is incremental: it focuses on dataset creation rather than a new detection method.

The authors tackled the problem of human detection in crowded environments by introducing the CrowdHuman dataset, which contains 470K human instances with an average of 22.6 persons per image and various occlusions, and demonstrated state-of-the-art cross-dataset generalization on benchmarks like Caltech-USA and CityPersons.

Human detection has witnessed impressive progress in recent years. However, the occlusion issue of detecting humans in highly crowded environments is far from solved. To make matters worse, crowd scenarios are still under-represented in current human detection benchmarks. In this paper, we introduce a new dataset, called CrowdHuman, to better evaluate detectors in crowd scenarios. The CrowdHuman dataset is large, richly annotated, and highly diverse. There are a total of $470K$ human instances in the train and validation subsets, with $\sim 22.6$ persons per image and various kinds of occlusions. Each human instance is annotated with a head bounding-box, a human visible-region bounding-box, and a human full-body bounding-box. Baseline performance of state-of-the-art detection frameworks on CrowdHuman is presented. The cross-dataset generalization results of the CrowdHuman dataset demonstrate state-of-the-art performance on previous datasets including Caltech-USA, CityPersons, and Brainwash without bells and whistles. We hope our dataset will serve as a solid baseline and help promote future research in human detection tasks.
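The three-box annotation described in the abstract can be pictured with a small sketch. This is an illustration only, not the dataset's actual file format or any official tooling: the class name `HumanInstance`, its field names, the `(x, y, width, height)` box convention, and the helper functions are all assumptions made for the example. It shows how a visible-region box and a full-body box together give a rough per-instance occlusion measure, and how a persons-per-image average would be computed.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention for this sketch: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class HumanInstance:
    """Hypothetical container for the three boxes annotated per person."""
    head_box: Box
    visible_box: Box
    full_box: Box


def box_area(box: Box) -> float:
    """Area of a box, treating negative sizes as empty."""
    _, _, width, height = box
    return max(width, 0.0) * max(height, 0.0)


def occlusion_ratio(instance: HumanInstance) -> float:
    """Fraction of the full-body box not covered by the visible-region box.

    This is a box-area approximation chosen for illustration, not a
    definition taken from the paper.
    """
    full = box_area(instance.full_box)
    if full == 0.0:
        return 0.0
    visible = min(box_area(instance.visible_box), full)
    return 1.0 - visible / full


def persons_per_image(images: List[List[HumanInstance]]) -> float:
    """Average number of annotated instances per image."""
    if not images:
        return 0.0
    return sum(len(instances) for instances in images) / len(images)
```

As a usage example, an instance whose full-body box is `(0, 0, 10, 20)` and whose visible-region box is `(0, 0, 10, 10)` has an occlusion ratio of 0.5 under this approximation, since half of the full-body area is not visible.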
