Memory Based Online Learning of Deep Representations from Video Streams
This work addresses real-time face recognition and tracking in unconstrained videos, for applications such as surveillance and human-computer interaction, and represents an incremental improvement over existing offline approaches.
The paper tackles online unsupervised face identity learning from video streams by combining deep face descriptors with a memory-based mechanism that leverages temporal coherence. Compared with offline methods that use future information, it achieves comparable results in multiple face tracking and better performance in face identification.
We present a novel online unsupervised method for face identity learning from video streams. The method exploits deep face descriptors together with a memory-based learning mechanism that takes advantage of the temporal coherence of visual data. Specifically, we introduce a discriminative feature matching solution based on Reverse Nearest Neighbour and a feature forgetting strategy that detects redundant features and discards them appropriately as time progresses. It is shown that the proposed learning procedure is asymptotically stable and can be effectively used in relevant applications such as multiple face identification and tracking from unconstrained video streams. Experimental results show that, compared with offline approaches that exploit future information, the proposed method achieves comparable results in multiple face tracking and better performance in face identification. Code will be publicly available.
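To make the two mechanisms named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that Reverse Nearest Neighbour matching means each stored memory feature searches for its nearest observation in the current frame (rather than each observation searching the memory), and that forgetting is modelled by a per-feature score that decays whenever a stored feature is matched, so that redundant features are eventually discarded. The class `FaceMemory`, the `eligibility` score, and the parameters `max_dist`, `decay`, and `min_eligibility` are hypothetical names introduced for illustration, and 2-D tuples stand in for deep face descriptors.

```python
import math


def _dist(a, b):
    """Euclidean distance between two feature vectors given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class FaceMemory:
    """Toy memory of (feature, identity, eligibility) items.

    All names and thresholds are assumptions made for this sketch.
    """

    def __init__(self, max_dist=0.5, decay=0.5, min_eligibility=0.3):
        self.max_dist = max_dist                # assumed matching threshold
        self.decay = decay                      # assumed decay applied on each match
        self.min_eligibility = min_eligibility  # assumed removal threshold
        self.items = []
        self.next_id = 0

    def step(self, observations):
        """Process the features of one frame and return one identity per feature."""
        labels = [None] * len(observations)
        best = [math.inf] * len(observations)

        # Reverse direction: every memory item looks for its nearest observation.
        for item in self.items:
            if not observations:
                break
            j = min(range(len(observations)),
                    key=lambda k: _dist(item["feature"], observations[k]))
            d = _dist(item["feature"], observations[j])
            if d <= self.max_dist:
                # A matched item is treated as redundant, so its score decays.
                item["eligibility"] *= self.decay
                if d < best[j]:
                    best[j] = d
                    labels[j] = item["identity"]

        # Unmatched observations start a new identity; all observations are stored.
        for j, obs in enumerate(observations):
            if labels[j] is None:
                labels[j] = self.next_id
                self.next_id += 1
            self.items.append(
                {"feature": obs, "identity": labels[j], "eligibility": 1.0}
            )

        # Forgetting: drop items whose score fell below the threshold.
        self.items = [it for it in self.items
                      if it["eligibility"] >= self.min_eligibility]
        return labels


memory = FaceMemory()
labels_1 = memory.step([(0.0, 0.0)])
labels_2 = memory.step([(0.1, 0.0), (5.0, 5.0)])
labels_3 = memory.step([(0.1, 0.0)])
```

In this toy run the first face keeps its identity across frames while the distant feature starts a new one, and the oldest stored feature is dropped once it has been matched twice. A real system would operate on high-dimensional descriptors and would need a more careful matching criterion than a single distance threshold.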