HC
Feb 2, 2019

Detecting Gaze Towards Eyes in Natural Social Interactions and its Use in Child Assessment

arXiv:1902.00607v1
79 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of measuring visual attention for applications such as assessing the social communication skills of children at risk for developmental disorders like autism. It is a domain-specific, incremental advance.

The paper tackles the problem of automated eye contact detection in adult-child social interactions using an egocentric camera, achieving a precision of 0.76, recall of 0.80, and area under the precision-recall curve of 0.79, which are significant improvements over existing methods.

Eye contact is a crucial element of non-verbal communication that signifies interest, attention, and participation in social interactions. As a result, measures of eye contact arise in a variety of applications such as the assessment of the social communication skills of children at risk for developmental disorders such as autism, or the analysis of turn-taking and social roles during group meetings. However, the automated measurement of visual attention during naturalistic social interactions is challenging due to the difficulty of estimating a subject's looking direction from video. This paper proposes a novel approach to eye contact detection during adult-child social interactions in which the adult wears a point-of-view camera which captures an egocentric view of the child's behavior. By analyzing the child's face regions and inferring their head pose we can accurately identify the onset and duration of the child's looks to their social partner's eyes. We introduce the Pose-Implicit CNN, a novel deep learning architecture that predicts eye contact while implicitly estimating the head pose. We present a fully automated system for eye contact detection that solves the sub-problems of end-to-end feature learning and pose estimation using deep neural networks. To train our models, we use a dataset comprising 22 hours of 156 play session videos from over 100 children, half of whom are diagnosed with Autism Spectrum Disorder. We report an overall precision of 0.76, recall of 0.80, and an area under the precision-recall curve of 0.79, all of which are significant improvements over existing methods.
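The reported precision, recall, and area under the precision-recall curve can be illustrated with a minimal sketch. It assumes frame-level binary labels (1 = eye contact) and a per-frame confidence score. The function names, the 0.5 threshold, and the toy data are hypothetical, and average precision is used here as one common step-wise estimate of the area under the curve; the paper's exact evaluation protocol may differ.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = eye contact)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Frames are ranked by descending score; precision is accumulated at
    every rank where a true positive is found. Tied scores are not
    treated specially in this sketch.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank
    return total / n_pos


# Toy data (hypothetical): ground truth and classifier scores per frame.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
ap = average_precision(y_true, scores)
print(f"precision={precision:.2f} recall={recall:.2f} AP={ap:.3f}")
```

Raising the threshold generally trades recall for precision, which is why a threshold-free summary such as the area under the precision-recall curve is reported alongside a single operating point.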

