CV, AI · Mar 19, 2024

Compound Expression Recognition via Multi Model Ensemble

arXiv:2403.12572v1 · 9 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the complexity of human emotional expressions, which matters for applications involving interpersonal interaction. Its contribution is incremental, however, because it applies existing ensemble techniques to a specific domain.

The paper tackles compound facial expression recognition with an ensemble learning method that combines convolutional networks, Vision Transformers, and multi-scale local attention networks. It achieves high accuracy on RAF-DB and enables zero-shot recognition on parts of C-EXPR-DB.

Compound Expression Recognition (CER) plays a crucial role in interpersonal interactions. Because compound expressions exist, human emotional expressions are complex, and judging them requires considering both local and global facial expressions. In this paper, we address this issue with a solution based on ensemble learning. Specifically, we treat the task as classification and train three expression classification models based on convolutional networks, Vision Transformers, and multi-scale local attention networks. We then ensemble the models using late fusion, merging their outputs to predict the final result. Our method achieves high accuracy on RAF-DB and can recognize expressions zero-shot on certain portions of C-EXPR-DB.
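The late-fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, so this example assumes a weighted average of per-model softmax probabilities, which is one common choice. The function names, the three-class toy logits, and the equal default weights are all hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def late_fusion(logits_per_model, weights=None):
    """Fuse per-model class scores by averaging their probabilities.

    logits_per_model: list of arrays, each of shape (n_samples, n_classes).
    weights: optional per-model weights (assumed equal if omitted).
    Returns (fused_probabilities, predicted_labels).
    """
    probs = np.stack(
        [softmax(np.asarray(l, dtype=float)) for l in logits_per_model]
    )  # shape: (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(logits_per_model))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise so the result stays a distribution
    fused = np.tensordot(weights, probs, axes=1)  # (n_samples, n_classes)
    return fused, fused.argmax(axis=-1)


# Toy logits standing in for the three model families (hypothetical values).
cnn_logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]]
vit_logits = [[1.5, 1.0, 0.2], [0.1, 2.0, 0.4]]
local_attn_logits = [[0.3, 0.2, 0.1], [0.0, 0.5, 2.5]]

fused_probs, labels = late_fusion([cnn_logits, vit_logits, local_attn_logits])
print(labels)
```

In the second toy sample the models disagree (one favours class 1, two favour class 2), and averaging probabilities lets the two more confident models decide. Other fusion rules, such as majority voting or learned weights, would fit the same interface.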

