CV · Oct 5, 2022

Learning Video-independent Eye Contact Segmentation from In-the-Wild Videos

arXiv:2210.02033v12.61 citations · h-index: 34 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses generic eye contact detection for social behavior analysis. It is rated an incremental advance because it builds on prior video-dependent methods, combining them with unsupervised pseudo-labeling.

The paper tackles the problem of detecting one-way eye contact in arbitrary videos by proposing a video-independent temporal segmentation model, achieving 71.88% framewise accuracy on a manually annotated test set of 52 videos.

Human eye contact is a form of non-verbal communication and can have a great influence on social behavior. Since the location and size of the eye contact targets vary across different videos, learning a generic video-independent eye contact detector is still a challenging task. In this work, we address the task of one-way eye contact detection for videos in the wild. Our goal is to build a unified model that can identify when a person is looking at their gaze targets in an arbitrary input video. Considering that this requires time-series relative eye movement information, we propose to formulate the task as a temporal segmentation problem. Due to the scarcity of labeled training data, we further propose a gaze target discovery method to generate pseudo-labels for unlabeled videos, which allows us to train a generic eye contact segmentation model in an unsupervised way using in-the-wild videos. To evaluate our proposed approach, we manually annotated a test dataset consisting of 52 videos of human conversations. Experimental results show that our eye contact segmentation model outperforms the previous video-dependent eye contact detector and achieves 71.88% framewise accuracy on our annotated test set. Our code and evaluation dataset are available at https://github.com/ut-vision/Video-Independent-ECS.
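
To make the evaluation terms above concrete, here is a minimal sketch of how framewise accuracy and a temporal segmentation could be derived from per-frame binary labels (1 = eye contact, 0 = no eye contact). The function names `framewise_accuracy` and `labels_to_segments` are hypothetical and are not taken from the authors' repository; the paper may also aggregate accuracy differently (for example, averaging per video rather than pooling frames).

```python
from itertools import groupby


def framewise_accuracy(pred, truth):
    """Fraction of frames whose predicted label matches the annotation.

    Assumption: both inputs are equal-length sequences of per-frame labels.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")
    if len(truth) == 0:
        raise ValueError("at least one frame is required")
    correct = sum(p == t for p, t in zip(pred, truth))
    return correct / len(truth)


def labels_to_segments(labels):
    """Collapse per-frame labels into (start, end, label) runs.

    `end` is exclusive, so a segment covers frames start .. end - 1.
    """
    segments = []
    start = 0
    for value, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((start, start + length, value))
        start += length
    return segments


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the paper.
    predicted = [0, 0, 1, 1, 1, 0]
    annotated = [0, 1, 1, 1, 0, 0]
    print(framewise_accuracy(predicted, annotated))
    print(labels_to_segments(predicted))
```

Viewing the output as segments rather than isolated frames reflects the temporal-segmentation framing: each run of identical labels corresponds to one contiguous interval of eye contact or no eye contact.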

