SPHCLGJul 10, 2025

Enhancing Generalization in PPG-Based Emotion Measurement with a CNN-TCN-LSTM Model

arXiv:2507.14173v11 citationsh-index: 30I2MTC
Originality Incremental advance
AI Analysis

This addresses a key problem in affective computing for improving emotion recognition in wearable devices, though it appears incremental as it combines existing neural network components.

The paper tackled the challenge of generalizing PPG-based emotion recognition models across individuals by introducing a hybrid CNN-TCN-LSTM architecture, which outperformed state-of-the-art models like CNN and CNN-LSTM on the PPGE dataset with improved AUC and F1 scores.

Human computer interaction has become integral to modern life, driven by advancements in machine learning technologies. Affective computing, in particular, has focused on systems that recognize, interpret, and respond to human emotions, often using wearable devices, which provide continuous data streams of physiological signals. Among various physiological signals, the photoplethysmogram (PPG) has gained prominence due to its ease of acquisition from widely available devices. However, the generalization of PPG-based emotion recognition models across individuals remains an unresolved challenge. This paper introduces a novel hybrid architecture that combines Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), and Temporal Convolutional Networks (TCNs) to address this issue. The proposed model integrates the strengths of these architectures to improve robustness and generalization. Raw PPG signals are fed into the CNN for feature extraction. These features are processed separately by LSTM and TCN. The outputs from these components are concatenated to generate a final feature representation, which serves as the input for classifying valence and arousal, the primary dimensions of emotion. Experiments using the Photoplethysmogram Dataset for Emotional Analysis (PPGE) demonstrate that the proposed hybrid model achieves better model generalization than standalone CNN and LSTM architectures. Our results show that the proposed solution outperforms the state-of-the-art CNN architecture, as well as a CNN-LSTM model, in emotion recognition tasks with PPG signals. Using metrics such as Area Under the Curve (AUC) and F1 Score, we highlight the model's effectiveness in handling subject variability.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes