CV · May 19, 2018

Long-term face tracking in the wild using deep learning

arXiv:1805.07646v1 · 6 citations
Originality: Incremental advance
AI Analysis

This paper addresses robust face tracking in dynamic, real-world environments, with applications such as surveillance and video analysis. The advance is incremental: it builds on existing deep learning and tracking techniques.

The paper tackles long-term face tracking of a specific person in unconstrained video streams, given a query image. It develops a detection-verification-tracking system that outperforms existing methods such as TLD and face-TLD in recall and precision when tested on a sitcom episode and a TV show.

This paper investigates long-term face tracking of a specific person in a video stream, given his/her face image in a single frame as a query. By taking advantage of deep learning models pre-trained on big data, a novel system is developed for accurate video face tracking in unconstrained environments depicting various people and objects moving in and out of the frame. In the proposed system, we present a detection-verification-tracking method (dubbed 'DVT') which accomplishes the long-term face tracking task through the collaboration of face detection, face verification, and (short-term) face tracking. An offline-trained detector based on cascaded convolutional neural networks localizes all faces appearing in the frames, and an offline-trained face verifier based on deep convolutional neural networks and similarity metric learning decides whether any face, and if so which one, corresponds to the queried person. An online-trained tracker follows the face from frame to frame. When validated on a sitcom episode and a TV show, the DVT method outperforms tracking-learning-detection (TLD) and face-TLD in terms of recall and precision. The proposed system is also tested on many other types of videos and shows very promising results.
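
The collaboration of detection, verification, and short-term tracking described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three components hand control to each other, so the loop below assumes one plausible arrangement: the tracker follows the face while it can, and detection plus verification re-acquire the person when the tracker loses the target. The names `dvt_track`, `verify`, `detect`, `embed`, and `make_tracker` are hypothetical, and cosine similarity with a fixed threshold stands in for the paper's learned similarity metric.

```python
import math
from typing import Any, Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); an assumed box format
Embedding = Sequence[float]


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Cosine similarity; a stand-in for the paper's learned similarity metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(
    query: Embedding,
    frame: Any,
    faces: Sequence[Box],
    embed: Callable[[Any, Box], Embedding],
    threshold: float,
) -> Optional[Box]:
    """Return the detected face most similar to the query, or None if no
    face reaches the threshold (the queried person is not in the frame)."""
    best_box: Optional[Box] = None
    best_score = threshold
    for box in faces:
        score = cosine_similarity(query, embed(frame, box))
        if score >= best_score:
            best_box, best_score = box, score
    return best_box


def dvt_track(
    frames: Sequence[Any],
    query: Embedding,
    detect: Callable[[Any], Sequence[Box]],
    embed: Callable[[Any, Box], Embedding],
    make_tracker: Callable[[Any, Box], Callable[[Any], Optional[Box]]],
    threshold: float = 0.5,
) -> List[Optional[Box]]:
    """Assumed control flow: track while possible, otherwise detect and verify.

    `make_tracker(frame, box)` is a hypothetical factory returning a callable
    that maps a new frame to the tracked box, or None when the target is lost.
    """
    results: List[Optional[Box]] = []
    tracker: Optional[Callable[[Any], Optional[Box]]] = None
    for frame in frames:
        box: Optional[Box] = None
        if tracker is not None:
            box = tracker(frame)      # short-term tracking
            if box is None:
                tracker = None        # target lost; fall back to detection
        if box is None:
            faces = detect(frame)     # localize all faces in the frame
            box = verify(query, frame, faces, embed, threshold)
            if box is not None:
                tracker = make_tracker(frame, box)  # (re)initialize tracker
        results.append(box)
    return results
```

In this arrangement a frame yields `None` whenever the tracker has lost the target and no detected face passes verification, which is how the sketch represents the queried person being out of frame. A real system would likely also re-verify the tracked face periodically to guard against drift; that step is omitted here.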

