CV, Sep 19, 2020

Recognizing Micro-Expression in Video Clip with Adaptive Key-Frame Mining

Min Peng, Chongyang Wang, Yuan Gao, Tao Bi, Tong Chen, Yu Shi, Xiang-Dong Zhou

arXiv:2009.09179v33.37 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of accurately detecting fleeting facial movements for applications in psychology and security, representing an incremental improvement over existing deep learning techniques.

The paper addresses micro-expression recognition in video clips, targeting two weaknesses of prior work: redundancy in full-video representations, and the reliance of single-apex-frame methods on expert annotations. It proposes an adaptive key-frame mining network (AKMNet), which the authors report improves recognition accuracy over state-of-the-art methods on multiple benchmark datasets.

As a spontaneous expression of emotion on the face, a micro-expression reveals an underlying emotion that a person cannot control. In a micro-expression, facial movement is transient and sparsely localized in time. However, existing representations, learned from a full video clip with various deep learning techniques, are usually redundant. In addition, methods that use the single apex frame of each video clip require expert annotations and sacrifice temporal dynamics. To simultaneously localize and recognize such fleeting facial movements, we propose a novel end-to-end deep learning architecture, referred to as the adaptive key-frame mining network (AKMNet). Operating on the video clip of a micro-expression, AKMNet learns a discriminative spatio-temporal representation by combining spatial features of self-learned local key frames with their global-temporal dynamics. Theoretical analysis and empirical evaluation show that the proposed approach improves recognition accuracy in comparison with state-of-the-art methods on multiple benchmark datasets.
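
The idea of combining self-selected key frames with their temporal dynamics can be illustrated with a small NumPy sketch. This is not the AKMNet architecture, which the abstract does not specify in detail. It is a toy illustration under stated assumptions: per-frame spatial features are already available, the scoring vector `w` stands in for learned parameters, and the function names `select_key_frames` and `clip_representation` are hypothetical.

```python
import numpy as np

def select_key_frames(frame_feats, w, threshold=0.5):
    """Score each frame and mark those whose score exceeds the threshold.

    frame_feats: (T, D) array of per-frame spatial features (assumed given).
    w: (D,) scoring vector; it would be learned in a real model, fixed here.
    """
    scores = 1.0 / (1.0 + np.exp(-(frame_feats @ w)))  # sigmoid, shape (T,)
    mask = scores > threshold
    return scores, mask

def clip_representation(frame_feats, scores, mask):
    """Pool score-weighted key-frame features and their frame-to-frame changes."""
    dim = frame_feats.shape[1]
    key = frame_feats[mask] * scores[mask, None]  # (K, D) weighted key frames
    if key.shape[0] == 0:
        return np.zeros(2 * dim)  # no frame selected
    local = key.mean(axis=0)  # local key-frame summary
    if key.shape[0] > 1:
        # mean absolute change between consecutive key frames
        dynamics = np.abs(np.diff(key, axis=0)).mean(axis=0)
    else:
        dynamics = np.zeros(dim)
    return np.concatenate([local, dynamics])

# Toy clip: 6 frames, 4-dim features, with movement only in frames 2 and 3.
feats = np.zeros((6, 4))
feats[2, 0] = 2.0
feats[3, 0] = 4.0
w = np.ones(4)

scores, mask = select_key_frames(feats, w)
rep = clip_representation(feats, scores, mask)
```

In this toy clip only the two frames with non-zero features pass the threshold, so the pooled representation depends on them alone. A trained model would learn the scoring and the features jointly, end to end, instead of using fixed values.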

