CVNov 27, 2021

Head and Body: Unified Detector and Graph Network for Person Search in Media

arXiv:2111.13888v1Has Code
Originality Highly original
AI Analysis

This addresses the overlooked problem of person search in media applications like video clipping, offering a solution for cloth-changing challenges.

The paper tackles person search in media scenarios, where individuals frequently change clothes, by proposing a unified detector and graph network (UDGNet) that simultaneously detects and re-identifies human bodies and heads, improving the anchor-free model AlignPS by 12.1% in mAP.

Person search in media has seen increasing potential in Internet applications, such as video clipping and character collection. This task is common but overlooked by previous person search works which focus on surveillance scenes. The media scenarios have some different challenges from surveillance scenes. For example, a person may change his clothes frequently. To alleviate this issue, this paper proposes a Unified Detector and Graph Network (UDGNet) for person search in media. UDGNet is the first person search framework to detect and re-identify the human body and head simultaneously. Specifically, it first builds two branches based on a unified network to detect the human body and head, then the detected body and head are used for re-identification. This dual-task approach can significantly enhance discriminative learning. To tackle the cloth-changing issue, UDGNet builds two graphs to explore reliable links among cloth-changing samples and utilizes a graph network to learn better embeddings. This design effectively enhances the robustness of person search to cloth-changing challenges. Besides, we demonstrate that UDGNet can be implemented with both anchor-based and anchor-free person search frameworks and further achieve performance improvement. This paper also contributes a large-scale dataset for Person Search in Media (PSM), which provides both body and head annotations. It is by far the largest dataset for person search in media. Experiments show that UDGNet improves the anchor-free model AlignPS by 12.1% in mAP. Meanwhile, it shows good generalization across surveillance and longterm scenarios. The dataset and code will be available at: https://github.com/shuxjweb/PSM.git.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes