Facial Spatiotemporal Graphs: Leveraging the 3D Facial Surface for Remote Physiological Measurement
This work addresses robust physiological measurement from facial videos, with applications in healthcare and biometrics, and presents a new modeling paradigm rather than an incremental improvement.
The paper tackles remote photoplethysmography (rPPG) by proposing the Facial Spatiotemporal Graph (STGraph), a representation that aligns processing with the 3D facial surface. It also introduces MeshPhys, a lightweight graph convolutional network that achieves state-of-the-art or competitive performance across four benchmark datasets.
Facial remote photoplethysmography (rPPG) methods estimate physiological signals by modeling subtle color changes on the 3D facial surface over time. However, existing methods fail to explicitly align their receptive fields with the 3D facial surface, which is the spatial support of the rPPG signal. To address this, we propose the Facial Spatiotemporal Graph (STGraph), a novel representation that encodes facial color and structure using 3D facial mesh sequences, enabling surface-aligned spatiotemporal processing. We introduce MeshPhys, a lightweight spatiotemporal graph convolutional network that operates on the STGraph to estimate physiological signals. Across four benchmark datasets, MeshPhys achieves state-of-the-art or competitive performance in both intra- and cross-dataset settings. Ablation studies show that constraining the model's receptive field to the facial surface acts as a strong structural prior, and that surface-aligned, 3D-aware node features are critical for robustly encoding facial surface color. Together, the STGraph and MeshPhys constitute a novel, principled modeling paradigm for facial rPPG, enabling robust, interpretable, and generalizable estimation. Code is available at https://samcantrill.github.io/facial-stgraph-rppg/.
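
To make the idea of a surface-aligned spatiotemporal graph concrete, the following is a minimal sketch of how a mesh sequence could be turned into a single graph. It is not the authors' implementation: the function names `mesh_edges` and `build_st_graph`, the choice of per-vertex feature vectors as node features, and the rule that temporal edges link each vertex only to itself in the next frame are all assumptions made for illustration.

```python
import numpy as np


def mesh_edges(faces):
    """Return unique undirected vertex-pair edges (E, 2) from triangles (F, 3)."""
    faces = np.asarray(faces)
    # Each triangle contributes its three sides.
    pairs = np.concatenate(
        [faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]], axis=0
    )
    # Sort endpoints so (a, b) and (b, a) collapse to one edge.
    pairs = np.sort(pairs, axis=1)
    return np.unique(pairs, axis=0)


def build_st_graph(node_feats, faces):
    """Build a spatiotemporal graph from a mesh sequence (illustrative only).

    node_feats: (T, V, C) per-frame, per-vertex features (e.g. sampled color).
    faces:      (F, 3) triangle indices, assumed shared by all frames.
    Returns (x, edges): x is (T*V, C); edges is (E, 2) over index t*V + v.
    """
    node_feats = np.asarray(node_feats)
    T, V, C = node_feats.shape

    # Spatial edges: the mesh connectivity, repeated in every frame.
    spatial = mesh_edges(faces)
    offsets = (np.arange(T) * V)[:, None, None]
    spatial_all = (spatial[None] + offsets).reshape(-1, 2)

    # Temporal edges (assumption): vertex v at frame t links to v at t + 1.
    src = (np.arange(T - 1)[:, None] * V + np.arange(V)[None]).reshape(-1)
    temporal = np.stack([src, src + V], axis=1)

    edges = np.concatenate([spatial_all, temporal], axis=0)
    return node_feats.reshape(T * V, C), edges


# Toy usage: two triangles sharing an edge, three frames, RGB-like features.
faces = [[0, 1, 2], [1, 2, 3]]
feats = np.zeros((3, 4, 3))
x, edges = build_st_graph(feats, faces)
```

Because every edge follows either the mesh connectivity or a vertex's own trajectory through time, any graph convolution applied to this structure has a receptive field confined to the facial surface, which is the structural prior the abstract describes.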