VIPL-HR: A Multi-modal Database for Pulse Estimation from Less-constrained Face Video
This work addresses the need for non-contact heart rate monitoring in real-world conditions, though its contribution is incremental in that it builds on existing remote HR estimation methods.
The paper tackles the problem of remote heart rate estimation from face videos in less-constrained scenarios by introducing a large-scale multi-modal database (VIPL-HR) with 2,378 visible light and 752 near-infrared videos, and a deep estimator (RhythmNet) that achieves promising results.
Heart rate (HR) is an important physiological signal that reflects the physical and emotional activities of humans. Traditional HR measurements are mainly based on contact monitors, which are inconvenient and may cause discomfort for the subjects. Recently, methods have been proposed for remote HR estimation from face videos. However, most existing methods focus on well-controlled scenarios, and their ability to generalize to less-constrained scenarios is not known. At the same time, the lack of large-scale databases has limited the use of deep representation learning methods in remote HR estimation. In this paper, we introduce a large-scale multi-modal HR database (named VIPL-HR), which contains 2,378 visible light (VIS) videos and 752 near-infrared (NIR) videos of 107 subjects. Our VIPL-HR database also covers variations such as head movements, illumination changes, and acquisition device changes. We also learn a deep HR estimator (named RhythmNet) with the proposed spatial-temporal representation, which achieves promising results on both public-domain HR estimation databases and our VIPL-HR database. We would like to put the VIPL-HR database into the public domain.
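For readers new to the task, the following is a minimal sketch of a conventional frequency-domain baseline for remote HR estimation: take a per-frame trace from the face region and report the strongest frequency within a plausible heart-rate band. It is not RhythmNet and not the spatial-temporal representation proposed in the paper. The function name `estimate_hr_bpm`, the band limits, and the synthetic trace are all assumptions made for illustration.

```python
import cmath
import math


def estimate_hr_bpm(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Return the dominant frequency of `signal` in beats per minute.

    `signal` is a hypothetical per-frame trace (for example, the mean green
    value of a face region). `lo_hz` and `hi_hz` are assumed band limits that
    restrict the search to 42-240 bpm.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC component

    best_bin, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n
        if not (lo_hz <= freq <= hi_hz):
            continue
        # Plain DFT coefficient for bin k (O(n) per bin; fine for a sketch).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power

    if best_bin is None:
        raise ValueError("signal too short for the requested band")
    return 60.0 * best_bin * fps / n


# Synthetic 10 s trace at 30 fps: a 1.2 Hz (72 bpm) pulse plus slow drift.
fps = 30
trace = [math.sin(2 * math.pi * 1.2 * t / fps)
         + 0.5 * math.sin(2 * math.pi * 0.2 * t / fps)
         for t in range(300)]
hr = estimate_hr_bpm(trace, fps)
```

With a 10-second window the frequency bins are 0.1 Hz apart, so this baseline resolves HR only to within 6 bpm. It also assumes a clean, stationary trace. Head movement and illumination changes, the conditions VIPL-HR is designed to cover, violate that assumption.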