CV · Feb 20, 2020

Focus on Semantic Consistency for Cross-domain Crowd Understanding

arXiv:2002.08623v1 · 53 citations
AI Analysis

This work addresses the labor-intensive data annotation issue in crowd analysis, offering an incremental improvement for domain adaptation in computer vision.

The paper tackles the problem of pixel-level crowd understanding by proposing a domain adaptation method that reduces estimation errors in background areas, achieving state-of-the-art results on three real datasets for cross-domain counting.

For pixel-level crowd understanding, data collection and annotation are time-consuming and laborious. Some domain adaptation algorithms try to ease this burden by training models with synthetic data, and results in recent works have demonstrated the feasibility of this approach. However, we found that a large number of estimation errors in the background areas impede the performance of existing methods. In this paper, we propose a domain adaptation method to eliminate them. Based on semantic consistency, that is, the similar distribution of deep-layer features in synthetic and real-world crowd areas, we first introduce a semantic extractor to effectively distinguish crowd from background in high-level semantic information. In addition, to further enhance the adapted model, we adopt adversarial learning to align features in the semantic space. Experiments on three representative real datasets show that the proposed domain adaptation scheme achieves state-of-the-art results for cross-domain counting problems.
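
The background-error observation in the abstract can be made concrete with a small sketch. It assumes the common crowd-counting convention that a count is the sum of a density map, and it assumes a binary crowd mask is available for evaluation. The function name `split_count_error` and the toy values are hypothetical and are not taken from the paper.

```python
def split_count_error(pred_density, gt_density, crowd_mask):
    """Split the signed counting error into crowd and background parts.

    All arguments are 2D lists of the same shape. crowd_mask holds 1 for
    crowd pixels and 0 for background pixels (an assumed input, not
    something the abstract specifies).
    """
    crowd_err = 0.0
    background_err = 0.0
    for pred_row, gt_row, mask_row in zip(pred_density, gt_density, crowd_mask):
        for pred, gt, is_crowd in zip(pred_row, gt_row, mask_row):
            if is_crowd:
                crowd_err += pred - gt
            else:
                background_err += pred - gt
    return crowd_err, background_err


# Toy 2x2 example: the top row is crowd, the bottom row is background.
pred = [[0.5, 0.5], [0.2, 0.1]]
gt = [[0.5, 0.5], [0.0, 0.0]]
mask = [[1, 1], [0, 0]]
crowd_err, background_err = split_count_error(pred, gt, mask)
print(crowd_err, background_err)
```

In this toy case the crowd region is estimated exactly, so the whole overcount comes from density predicted on background pixels, which is the kind of error the proposed method aims to reduce.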

