CV · Nov 13, 2025

Batch Transformer Architecture: Case of Synthetic Image Generation for Emotion Expression Facial Recognition

arXiv:2511.11754v1 · Athens J Sci

Originality: Incremental advance

AI Analysis

This work addresses data augmentation for facial recognition, particularly in cases with makeup and occlusion, but appears incremental as it modifies existing Transformer methods.

The authors address limited data variability in facial recognition by proposing a Batch Transformer architecture that attends to the important dimensions and thereby reduces the bottleneck size. Tested on a makeup and occlusion data set, the generated synthetic images increase the variability of the limited original data.

A novel Transformer variant is proposed in the implicit sparse style. Whereas "traditional" Transformers attend to sequential or batch entities across their whole dimensionality, the proposed Batch Transformer attends only to the "important" dimensions (primary components). This selection of "important" dimensions, a form of feature selection, allows a significant reduction of the bottleneck size in encoder-decoder ANN architectures. The architecture is tested on synthetic image generation for face recognition on a makeup and occlusion data set, where it increases the variability of the limited original data set.
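
As a rough illustration of attending to dimensions rather than to whole entities, the sketch below treats each feature dimension of a batch as a token and keeps only the dimensions that receive the most attention. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `dimension_attention`, the importance score (total attention received), the top-k selection, and the absence of learned projections are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def dimension_attention(x, k):
    """Select k "important" dimensions of a batch x with shape (batch, dims).

    Assumption: each feature dimension is a token described by its values
    across the batch, and importance is the total attention it receives.
    """
    tokens = x.T                                          # (dims, batch)
    tokens = tokens - tokens.mean(axis=1, keepdims=True)  # centre each dimension
    # Scaled dot-product scores between dimensions (no learned projections).
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])  # (dims, dims)
    weights = softmax(scores, axis=-1)                     # each row sums to 1
    importance = weights.sum(axis=0)                       # attention received
    kept = np.sort(np.argsort(importance)[-k:])            # indices of top-k dims
    return x[:, kept], kept, weights


# Toy batch: 8 samples with 16 features, reduced to a 4-dimensional bottleneck.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
reduced, kept, weights = dimension_attention(x, k=4)
print(reduced.shape, kept)
```

In an encoder-decoder setting, the reduced batch would stand in for the bottleneck representation; how the paper actually scores and selects dimensions is not specified in the abstract.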

