CV · Jan 6, 2018

Domain-Specific Face Synthesis for Video Face Recognition from a Single Sample Per Person

arXiv:1801.01974v2 · 40 citations
AI Analysis

This work addresses domain shift in face recognition for video surveillance applications, offering an incremental improvement over existing methods.

The paper tackles the decline in face recognition performance that domain shift causes in still-to-video systems, particularly when each person is enrolled with a single reference still. It introduces a domain-specific face synthesis algorithm that generates synthetic faces to augment the reference set. Experimental results show that this approach achieves higher accuracy than state-of-the-art methods, with only a moderate increase in computational complexity.

The performance of still-to-video face recognition (FR) systems can decline significantly because faces captured in an unconstrained operational domain (OD) over multiple video cameras have a different underlying data distribution compared to faces captured under controlled conditions in the enrollment domain (ED) with a still camera. This is particularly true when individuals are enrolled in the system using a single reference still. To improve the robustness of these systems, it is possible to augment the reference set by generating synthetic faces based on the original still. However, without knowledge of the OD, many synthetic images must be generated to account for all possible capture conditions. FR systems may, therefore, require complex implementations and yield lower accuracy when training on many less relevant images. This paper introduces an algorithm for domain-specific face synthesis (DSFS) that exploits the representative intra-class variation information available from the OD. Prior to operation, a compact set of faces from unknown persons appearing in the OD is selected through clustering in the capture condition space. The domain-specific variations of these face images are projected onto the reference stills by integrating an image-based face relighting technique within a 3D reconstruction framework. This generates a compact set of synthetic faces that resemble individuals of interest under the capture conditions relevant to the OD. In a particular implementation based on sparse representation classification, the synthetic faces generated with DSFS are employed to form a cross-domain dictionary that accounts for structured sparsity. Experimental results reveal that augmenting the reference gallery set of FR systems using the proposed DSFS approach can provide a higher level of accuracy compared to state-of-the-art approaches, with only a moderate increase in computational complexity.
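The selection step described above (choosing a compact set of OD faces through clustering in the capture condition space) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering method, so plain k-means is used here as a stand-in, and the function name `select_representatives` and the idea of describing each face by a small numeric vector of capture conditions (for example pose and illumination descriptors) are assumptions made for illustration only.

```python
import numpy as np

def select_representatives(features, k, n_iter=50, seed=0):
    """Return indices of up to k exemplar rows of `features`.

    Hypothetical helper: each row is assumed to describe the capture
    conditions of one OD face. Plain k-means groups the rows, then the
    sample closest to each centroid is kept as a representative.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    # Initialise centroids from k distinct samples.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every sample to every centroid, shape (n, k).
        d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # For each centroid, keep the index of the nearest real sample.
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return sorted({int(i) for i in d2.argmin(axis=0)})
```

In the pipeline the abstract describes, the variations of the selected faces would then be transferred onto each reference still to produce the synthetic set; that synthesis step is outside the scope of this sketch.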

Code Implementations: 1 repo