Learning to Amend Facial Expression Representation via De-albino and Affinity
This work improves facial expression recognition, which matters for applications such as human-computer interaction. The contribution is incremental in that it builds on existing network architectures.
The paper tackles facial expression recognition by countering the feature-map erosion caused by convolution padding and by exploiting the affinity features shared across expressions. It proposes the Amending Representation Module (ARM), which achieves validation accuracies of 90.42% on RAF-DB, 65.2% on Affect-Net, and 58.71% on SFEW, reported as exceeding state-of-the-art methods.
Facial Expression Recognition (FER) is a classification task over variants of the same object, the face. As a result, facial expressions share certain affinity features, which have received little attention in the FER literature. Convolution padding helps capture edge information, but it also erodes the feature map. After several layers of padded convolution, the output feature map, which we call the albino feature, weakens the representation of the expression. To tackle these challenges, we propose a novel architecture named the Amending Representation Module (ARM). ARM is a substitute for the pooling layer. In principle, it can be embedded at the back end of any network to deal with Padding Erosion. ARM enhances facial expression representation in two ways: 1) reducing the weight of eroded features to offset the side effect of padding, and 2) decomposing facial features to simplify representation learning. Experiments on public benchmarks show that ARM boosts FER performance remarkably. The validation accuracies are 90.42% on RAF-DB, 65.2% on Affect-Net, and 58.71% on SFEW, exceeding current state-of-the-art methods. Our implementation and trained models are available at https://github.com/JiaweiShiCV/Amend-Representation-Module.
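The erosion effect of zero padding can be illustrated with a small, self-contained sketch. This is not the ARM implementation; it is a toy example that assumes a fixed 3x3 mean filter and an all-ones input, chosen so that any value below 1.0 in the output is attributable to padding alone. The function names `mean_conv3x3_zero_pad` and `stack_layers` are hypothetical and do not come from the authors' repository.

```python
def mean_conv3x3_zero_pad(grid):
    """Apply one 3x3 mean filter with zero padding (output keeps the input size)."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    # Positions outside the grid are padding and contribute zero.
                    if 0 <= y < h and 0 <= x < w:
                        total += grid[y][x]
            out[i][j] = total / 9.0
    return out


def stack_layers(grid, num_layers):
    """Apply the padded filter num_layers times, mimicking a stack of conv layers."""
    for _ in range(num_layers):
        grid = mean_conv3x3_zero_pad(grid)
    return grid


if __name__ == "__main__":
    ones = [[1.0] * 7 for _ in range(7)]  # toy 7x7 feature map, assumed for illustration
    for layers in (1, 2, 3, 4):
        fmap = stack_layers(ones, layers)
        # Print the corner and the center value after each depth.
        print(layers, round(fmap[0][0], 3), round(fmap[3][3], 3))
```

After one layer the corner keeps only 4/9 of its value and a non-corner border cell keeps 6/9, because five or three of the nine taps fall on padding. Each extra layer pushes the attenuation one cell further inward, which is the behaviour the abstract refers to as Padding Erosion.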