CV | AI | Nov 27, 2025

ARPGNet: Appearance- and Relation-aware Parallel Graph Attention Fusion Network for Facial Expression Recognition

arXiv:2511.22188v1 | 1 citation
Originality: Incremental advance
AI Analysis

This work aims to improve facial expression recognition accuracy for applications such as human-computer interaction. Its contribution is incremental: it builds on existing graph and attention methods.

The paper approaches facial expression recognition by learning spatial-temporal representations that combine facial appearance with the relationships between facial regions. The resulting model, ARPGNet, matches or outperforms state-of-the-art methods on three datasets.

The key to facial expression recognition is to learn discriminative spatial-temporal representations that embed facial expression dynamics. Previous studies predominantly rely on pre-trained Convolutional Neural Networks (CNNs) to learn facial appearance representations, overlooking the relationships between facial regions. To address this issue, this paper presents an Appearance- and Relation-aware Parallel Graph attention fusion Network (ARPGNet) to learn mutually enhanced spatial-temporal representations of appearance and relation information. Specifically, we construct a facial region relation graph and leverage the graph attention mechanism to model the relationships between facial regions. The resulting relational representation sequences, along with CNN-based appearance representation sequences, are then fed into a parallel graph attention fusion module for mutual interaction and enhancement. This module simultaneously explores the complementarity between different representation sequences and the temporal dynamics within each sequence. Experimental results on three facial expression recognition datasets demonstrate that the proposed ARPGNet outperforms or is comparable to state-of-the-art methods.
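
To make the graph attention step in the abstract concrete, here is a minimal sketch of one attention layer over a facial region graph. It uses the standard single-head graph attention formulation (shared linear projection, LeakyReLU-scored neighbor pairs, softmax over neighbors), which is an assumption: the summary does not specify the paper's exact layer, graph construction, number of regions, or feature sizes. The four regions, the adjacency matrix, and all dimensions below are hypothetical, and the parallel fusion module is not shown.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    # W has shape (out_dim, in_dim); x has length in_dim.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def graph_attention_layer(features, adjacency, W, a):
    """Single-head graph attention over region nodes (illustrative only).

    features:  N node vectors, each of length in_dim
    adjacency: N x N matrix of 0/1; every node needs a self-loop
    W:         shared projection, shape (out_dim, in_dim)
    a:         attention vector of length 2 * out_dim
    Returns (updated node vectors, N x N attention matrix).
    """
    h = [matvec(W, x) for x in features]
    n = len(h)
    out, attn = [], []
    for i in range(n):
        neighbors = [j for j in range(n) if adjacency[i][j]]
        # Score each (i, j) pair from the concatenated projected features.
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, h[i] + h[j])))
            for j in neighbors
        ]
        alpha = softmax(scores)
        row = [0.0] * n
        for j, w in zip(neighbors, alpha):
            row[j] = w
        attn.append(row)
        # New representation: attention-weighted sum of neighbor features.
        out.append([
            sum(w * h[j][k] for j, w in zip(neighbors, alpha))
            for k in range(len(h[i]))
        ])
    return out, attn


random.seed(0)
in_dim, out_dim = 5, 3
# Hypothetical regions: 0 = brows, 1 = eyes, 2 = nose, 3 = mouth.
features = [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(4)]
adjacency = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
W = [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
a = [random.uniform(-1, 1) for _ in range(2 * out_dim)]

out, attn = graph_attention_layer(features, adjacency, W, a)
```

Each row of `attn` is a probability distribution over that region's neighbors, so regions that are not connected in the graph contribute nothing to one another. Applying such a layer per frame would yield one relational representation per region per time step; how ARPGNet then fuses these sequences with the CNN appearance sequences is described only at a high level in the abstract.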

