CVMar 14, 2023
Non-Contrastive Unsupervised Learning of Physiological Signals from VideoJeremy Speth, Nathan Vance, Patrick Flynn et al.
Subtle periodic signals such as blood volume pulse and respiration can be extracted from RGB video, enabling remote health monitoring at low cost. Advancements in remote pulse estimation -- or remote photoplethysmography (rPPG) -- are currently driven by deep learning solutions. However, modern approaches are trained and evaluated on benchmark datasets with associated ground truth from contact-PPG sensors. We present the first non-contrastive unsupervised learning framework for signal regression to break free from the constraints of labelled video data. With minimal assumptions of periodicity and finite bandwidth, our approach is capable of discovering the blood volume pulse directly from unlabelled videos. We find that encouraging sparse power spectra within normal physiological bandlimits and variance over batches of power spectra is sufficient for learning visual features of periodic signals. We perform the first experiments utilizing unlabelled video data not specifically created for rPPG to train robust pulse rate estimators. Given the limited inductive biases and impressive empirical results, the approach is theoretically capable of discovering other periodic signals from video, enabling multiple physiological measurements without the need for ground truth signals. Codes to fully reproduce the experiments are made available along with the paper.
CVMar 16, 2023
Full-Body Cardiovascular Sensing with Remote PhotoplethysmographyLu Niu, Jeremy Speth, Nathan Vance et al.
Remote photoplethysmography (rPPG) allows for noncontact monitoring of blood volume changes from a camera by detecting minor fluctuations in reflected light. Prior applications of rPPG focused on face videos. In this paper we explored the feasibility of rPPG from non-face body regions such as the arms, legs, and hands. We collected a new dataset titled Multi-Site Physiological Monitoring (MSPM), which will be released with this paper. The dataset consists of 90 frames per second video of exposed arms, legs, and face, along with 10 synchronized PPG recordings. We performed baseline heart rate estimation experiments from non-face regions with several state-of-the-art rPPG approaches, including chrominance-based (CHROM), plane-orthogonal-to-skin (POS) and RemotePulseNet (RPNet). To our knowledge, this is the first evaluation of the fidelity of rPPG signals simultaneously obtained from multiple regions of a human body. Our experiments showed that skin pixels from arms, legs, and hands are all potential sources of the blood volume pulse. The best-performing approach, POS, achieved a mean absolute error peaking at 7.11 beats per minute from non-facial body parts compared to 1.38 beats per minute from the face. Additionally, we performed experiments on pulse transit time (PTT) from both the contact PPG and rPPG signals. We found that remote PTT is possible with moderately high frame rate video when distal locations on the body are visible. These findings and the supporting dataset should facilitate new research on non-face rPPG and monitoring blood flow dynamics over the whole body with a camera.
CVMar 11, 2023
Hallucinated Heartbeats: Anomaly-Aware Remote Pulse EstimationJeremy Speth, Nathan Vance, Benjamin Sporrer et al.
Camera-based physiological monitoring, especially remote photoplethysmography (rPPG), is a promising tool for health diagnostics, and state-of-the-art pulse estimators have shown impressive performance on benchmark datasets. We argue that evaluations of modern solutions may be incomplete, as we uncover failure cases for videos without a live person, or in the presence of severe noise. We demonstrate that spatiotemporal deep learning models trained only with live samples "hallucinate" a genuine-shaped pulse on anomalous and noisy videos, which may have negative consequences when rPPG models are used by medical personnel. To address this, we offer: (a) An anomaly detection model, built on top of the predicted waveforms. We compare models trained in open-set (unknown abnormal predictions) and closed-set (abnormal predictions known when training) settings; (b) An anomaly-aware training regime that penalizes the model for predicting periodic signals from anomalous videos. Extensive experimentation with eight research datasets (rPPG-specific: DDPM, CDDPM, PURE, UBFC, ARPM; deep fakes: DFDC; face presentation attack detection: HKBU-MARs; rPPG outlier: KITTI) show better accuracy of anomaly detection for deep learning models incorporating the proposed training (75.8%), compared to models trained regularly (73.7%) and to hand-crafted rPPG methods (52-62%).
CVFeb 3, 2024
MSPM: A Multi-Site Physiological Monitoring Dataset for Remote Pulse, Respiration, and Blood Pressure EstimationJeremy Speth, Nathan Vance, Benjamin Sporrer et al.
Visible-light cameras can capture subtle physiological biomarkers without physical contact with the subject. We present the Multi-Site Physiological Monitoring (MSPM) dataset, which is the first dataset collected to support the study of simultaneous camera-based vital signs estimation from multiple locations on the body. MSPM enables research on remote photoplethysmography (rPPG), respiration rate, and pulse transit time (PTT); it contains ground-truth measurements of pulse oximetry (at multiple body locations) and blood pressure using contacting sensors. We provide thorough experiments demonstrating the suitability of MSPM to support research on rPPG, respiration rate, and PTT. Cross-dataset rPPG experiments reveal that MSPM is a challenging yet high quality dataset, with intra-dataset pulse rate mean absolute error (MAE) below 4 beats per minute (BPM), and cross-dataset pulse rate MAE below 2 BPM in certain cases. Respiration experiments find a MAE of 1.09 breaths per minute by extracting motion features from the chest. PTT experiments find that across the pairs of different body sites, there is high correlation between remote PTT and contact-measured PTT, which facilitates the possibility for future camera-based PTT research.
CVApr 12, 2024
Measuring Domain Shifts using Deep Learning Remote Photoplethysmography Model SimilarityNathan Vance, Patrick Flynn
Domain shift differences between training data for deep learning models and the deployment context can result in severe performance issues for models which fail to generalize. We study the domain shift problem under the context of remote photoplethysmography (rPPG), a technique for video-based heart rate inference. We propose metrics based on model similarity which may be used as a measure of domain shift, and we demonstrate high correlation between these metrics and empirical performance. One of the proposed metrics with viable correlations, DS-diff, does not assume access to the ground truth of the target domain, i.e. it may be applied to in-the-wild data. To that end, we investigate a model selection problem in which ground truth results for the evaluation domain is not known, demonstrating a 13.9% performance improvement over the average case baseline.
CVApr 20, 2024
SiNC+: Adaptive Camera-Based Vitals with Unsupervised Learning of Periodic SignalsJeremy Speth, Nathan Vance, Patrick Flynn et al.
Subtle periodic signals, such as blood volume pulse and respiration, can be extracted from RGB video, enabling noncontact health monitoring at low cost. Advancements in remote pulse estimation -- or remote photoplethysmography (rPPG) -- are currently driven by deep learning solutions. However, modern approaches are trained and evaluated on benchmark datasets with ground truth from contact-PPG sensors. We present the first non-contrastive unsupervised learning framework for signal regression to mitigate the need for labelled video data. With minimal assumptions of periodicity and finite bandwidth, our approach discovers the blood volume pulse directly from unlabelled videos. We find that encouraging sparse power spectra within normal physiological bandlimits and variance over batches of power spectra is sufficient for learning visual features of periodic signals. We perform the first experiments utilizing unlabelled video data not specifically created for rPPG to train robust pulse rate estimators. Given the limited inductive biases, we successfully applied the same approach to camera-based respiration by changing the bandlimits of the target signal. This shows that the approach is general enough for unsupervised learning of bandlimited quasi-periodic signals from different domains. Furthermore, we show that the framework is effective for finetuning models on unlabelled video from a single subject, allowing for personalized and adaptive signal regressors.
CVJan 9, 2024
Refining Remote Photoplethysmography Architectures using CKA and Empirical MethodsNathan Vance, Patrick Flynn
Model architecture refinement is a challenging task in deep learning research fields such as remote photoplethysmography (rPPG). One architectural consideration, the depth of the model, can have significant consequences on the resulting performance. In rPPG models that are overprovisioned with more layers than necessary, redundancies exist, the removal of which can result in faster training and reduced computational load at inference time. With too few layers the models may exhibit sub-optimal error rates. We apply Centered Kernel Alignment (CKA) to an array of rPPG architectures of differing depths, demonstrating that shallower models do not learn the same representations as deeper models, and that after a certain depth, redundant layers are added without significantly increased functionality. An empirical study confirms how the architectural deficiencies discovered using CKA impact performance, and we show how CKA as a diagnostic can be used to refine rPPG architectures.
CVMay 24, 2023
Promoting Generalization in Cross-Dataset Remote PhotoplethysmographyNathan Vance, Jeremy Speth, Benjamin Sporrer et al.
Remote Photoplethysmography (rPPG), or the remote monitoring of a subject's heart rate using a camera, has seen a shift from handcrafted techniques to deep learning models. While current solutions offer substantial performance gains, we show that these models tend to learn a bias to pulse wave features inherent to the training dataset. We develop augmentations to mitigate this learned bias by expanding both the range and variability of heart rates that the model sees while training, resulting in improved model convergence when training and cross-dataset generalization at test time. Through a 3-way cross dataset analysis we demonstrate a reduction in mean absolute error from over 13 beats per minute to below 3 beats per minute. We compare our method with other recent rPPG systems, finding similar performance under a variety of evaluation parameters.
CVOct 21, 2021
Digital and Physical-World Attacks on Remote Pulse DetectionJeremy Speth, Nathan Vance, Patrick Flynn et al.
Remote photoplethysmography (rPPG) is a technique for estimating blood volume changes from reflected light without the need for a contact sensor. We present the first examples of presentation attacks in the digital and physical domains on rPPG from face video. Digital attacks are easily performed by adding imperceptible periodic noise to the input videos. Physical attacks are performed with illumination from visible spectrum LEDs placed in close proximity to the face, while still being difficult to perceive with the human eye. We also show that our attacks extend beyond medical applications, since the method can effectively generate a strong periodic pulse on 3D-printed face masks, which presents difficulties for pulse-based face presentation attack detection (PAD). The paper concludes with ideas for using this work to improve robustness of rPPG methods and pulse-based face PAD.
CVJun 11, 2021
Deception Detection and Remote Physiological Monitoring: A Dataset and Baseline Experimental ResultsJeremy Speth, Nathan Vance, Adam Czajka et al.
We present the Deception Detection and Physiological Monitoring (DDPM) dataset and initial baseline results on this dataset. Our application context is an interview scenario in which the interviewee attempts to deceive the interviewer on selected responses. The interviewee is recorded in RGB, near-infrared, and long-wave infrared, along with cardiac pulse, blood oxygenation, and audio. After collection, data were annotated for interviewer/interviewee, curated, ground-truthed, and organized into train / test parts for a set of canonical deception detection experiments. Baseline experiments found random accuracy for micro-expressions as an indicator of deception, but that saccades can give a statistically significant response. We also estimated subject heart rates from face videos (remotely) with a mean absolute error as low as 3.16 bpm. The database contains almost 13 hours of recordings of 70 subjects, and over 8 million visible-light, near-infrared, and thermal video frames, along with appropriate meta, audio and pulse oximeter data. To our knowledge, this is the only collection offering recordings of five modalities in an interview scenario that can be used in both deception detection and remote photoplethysmography research.
CVJan 11, 2021
Remote Pulse Estimation in the Presence of Face MasksJeremy Speth, Nathan Vance, Patrick Flynn et al.
Remote photoplethysmography (rPPG), a family of techniques for monitoring blood volume changes, may be especially useful for widespread contactless health monitoring using face video from consumer-grade visible-light cameras. The COVID-19 pandemic has caused the widespread use of protective face masks. We found that occlusions from cloth face masks increased the mean absolute error of heart rate estimation by more than 80\% when deploying methods designed on unmasked faces. We show that augmenting unmasked face videos by adding patterned synthetic face masks forces the model to attend to the periocular and forehead regions, improving performance and closing the gap between masked and unmasked pulse estimation. To our knowledge, this paper is the first to analyse the impact of face masks on the accuracy of pulse estimation and offers several novel contributions: (a) 3D CNN-based method designed for remote photoplethysmography in a presence of face masks, (b) two publicly available pulse estimation datasets acquired from 86 unmasked and 61 masked subjects, (c) evaluations of handcrafted algorithms and a 3D CNN trained on videos of unmasked faces and with masks synthetically added, and (d) data augmentation method to add a synthetic mask to a face video.
LGSep 11, 2019
An Online Reinforcement Learning Approach to Quality-Cost-Aware Task Allocation for Multi-Attribute Social SensingYang Zhang, Daniel Zhang, Nathan Vance et al.
Social sensing has emerged as a new sensing paradigm where humans (or devices on their behalf) collectively report measurements about the physical world. This paper focuses on a quality-cost-aware task allocation problem in multi-attribute social sensing applications. The goal is to identify a task allocation strategy (i.e., decide when and where to collect sensing data) to achieve an optimized tradeoff between the data quality and the sensing cost. While recent progress has been made to tackle similar problems, three important challenges have not been well addressed: (i) "online task allocation": the task allocation schemes need to respond quickly to the potentially large dynamics of the measured variables in social sensing; (ii) "multi-attribute constrained optimization": minimizing the overall sensing error given the dependencies and constraints of multiple attributes of the measured variables is a non-trivial problem to solve; (iii) "nonuniform task allocation cost": the task allocation cost in social sensing often has a nonuniform distribution which adds additional complexity to the optimized task allocation problem. This paper develops a Quality-Cost-Aware Online Task Allocation (QCO-TA) scheme to address the above challenges using a principled online reinforcement learning framework. We evaluate the QCO-TA scheme through a real-world social sensing application and the results show that our scheme significantly outperforms the state-of-the-art baselines in terms of both sensing accuracy and cost.
HCMay 18, 2019
When Social Sensing Meets Edge Computing: Vision and ChallengesDaniel Zhang, Nathan Vance, Dong Wang
This paper overviews the state of the art, research challenges, and future opportunities in an emerging research direction: Social Sensing based Edge Computing (SSEC). Social sensing has emerged as a new sensing application paradigm where measurements about the physical world are collected from humans or from devices on their behalf. The advent of edge computing pushes the frontier of computation, service, and data along the cloud-to-things continuum. The merging of these two technical trends generates a set of new research challenges that need to be addressed. In this paper, we first define the new SSEC paradigm that is motivated by a few underlying technology trends. We then present a few representative real-world case studies of SSEC applications and several key research challenges that exist in those applications. Finally, we envision a few exciting research directions in future SSEC. We hope this paper will stimulate discussions of this emerging research direction in the community.