CV · Sep 16, 2020

Red Carpet to Fight Club: Partially-supervised Domain Transfer for Face Recognition in Violent Videos

arXiv:2009.07576v1
AI Analysis

This work addresses a domain-transfer problem for face recognition in criminal investigations, but it is incremental as it builds on existing techniques for cross-domain adaptation.

The paper tackles the problem of face recognition in violent videos by transferring models trained on clean portrait-like photos, addressing challenges like domain discrepancy and limited training data. It introduces the WildestFaces dataset and proposes methods including stacked affine-transforms and attention-driven pooling, achieving improved recognition in adverse conditions.

In many real-world problems, there is typically a large discrepancy between the characteristics of data used in training versus deployment. A prime example is the analysis of aggression videos: in a criminal incident, suspects typically need to be identified based on their clean portrait-like photos, instead of their prior video recordings. This results in three major challenges: large domain discrepancy between violence videos and ID-photos, the lack of video examples for most individuals, and limited training data availability. To mimic such scenarios, we formulate a realistic domain-transfer problem, where the goal is to transfer the recognition model trained on clean posed images to the target domain of violent videos, where training videos are available only for a subset of subjects. To this end, we introduce the WildestFaces dataset, tailored to study cross-domain recognition under a variety of adverse conditions. We divide the task of transferring a recognition model from the domain of clean images to the violent videos into two sub-problems and tackle them using (i) stacked affine-transforms for classifier-transfer and (ii) attention-driven pooling for temporal-adaptation. We additionally formulate a self-attention-based model for domain-transfer. We establish a rigorous evaluation protocol for this clean-to-violent recognition task, and present a detailed analysis of the proposed dataset and the methods. Our experiments highlight the unique challenges introduced by the WildestFaces dataset and the advantages of the proposed approach.
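
To make the idea of attention-driven pooling over video frames more concrete, here is a minimal sketch of generic attention-weighted temporal pooling. It is an illustration under stated assumptions, not the paper's model: the abstract does not specify the scoring function, so the dot-product score, the `query` vector, and the function name `attention_pool` are all hypothetical.

```python
import math


def attention_pool(frame_features, query):
    """Pool per-frame feature vectors into a single video-level vector.

    frame_features: list of T vectors, each a list of D floats.
    query: hypothetical scoring vector of length D (an assumption of this
           sketch; the abstract does not say how frames are scored).
    Returns (pooled_vector, attention_weights).
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frame_features]

    # Softmax over frames, shifted by the max score for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The pooled vector is the attention-weighted sum of the frame features.
    dim = len(frame_features[0])
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return pooled, weights


# Two toy "frames" with 2-D features; the query favours the first dimension.
frames = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attention_pool(frames, query=[10.0, 0.0])
```

With a query that favours the first feature dimension, nearly all of the weight goes to the first frame, so the pooled vector stays close to that frame's features. With an all-zero query, the weights are uniform and the result reduces to plain average pooling.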
