HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty
This work addresses head pose estimation for applications such as social interaction analysis. The contribution is incremental: it builds on existing keypoint-based methods and mainly improves efficiency.
The paper tackles head pose estimation from single images using keypoints, achieving comparable accuracy while being faster and smaller than state-of-the-art methods, and provides heteroscedastic uncertainty estimates correlated with errors.
In this paper we introduce a novel method to estimate the head pose of people in single images starting from a small set of head keypoints. To this end, we propose a regression model that takes keypoints computed automatically by 2D pose estimation algorithms and outputs the head pose represented by yaw, pitch, and roll. Our model is simple to implement and more efficient than the state of the art (faster at inference and smaller in memory footprint), with comparable accuracy. Our method also provides a measure of the heteroscedastic uncertainty associated with each of the three angles, through an appropriately designed loss function; we show that error and uncertainty values are correlated, so this extra source of information can be used in subsequent computational steps. As an example application, we address social interaction analysis in images: we propose an algorithm that quantitatively estimates the level of interaction between people, starting from their head poses and reasoning on their mutual positions. The code is available at https://github.com/cantarinigiorgio/HHP-Net.
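The abstract does not spell out the loss function, so the following is only a minimal sketch of one common way to obtain heteroscedastic uncertainty for a regression output: a Gaussian negative log-likelihood in which the model predicts, for each angle, both a mean and a log-variance. The function name `heteroscedastic_loss`, the log-variance parameterization, and the averaging over the three angles are assumptions made for illustration, not the formulation confirmed by the paper.

```python
import math


def heteroscedastic_loss(targets, means, log_vars):
    """Gaussian negative log-likelihood (up to an additive constant).

    targets:  ground-truth (yaw, pitch, roll) angles.
    means:    predicted (yaw, pitch, roll) angles.
    log_vars: predicted s = log(sigma^2) for each angle
              (assumed parameterization, chosen for numerical stability).
    Returns the loss averaged over the angles.
    """
    total = 0.0
    for y, mu, s in zip(targets, means, log_vars):
        # The squared error is down-weighted when the predicted variance is
        # large, while the 0.5 * s term penalizes claiming high uncertainty.
        total += 0.5 * math.exp(-s) * (y - mu) ** 2 + 0.5 * s
    return total / len(targets)


def std_from_log_var(s):
    """Convert a predicted log-variance into a standard deviation."""
    return math.exp(0.5 * s)
```

Under this formulation, for a fixed error e on one angle the loss is minimized when the predicted variance equals e squared, which is one reason a trained model's uncertainty would be expected to track its error.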