A NIR-to-VIS face recognition via part adaptive and relation attention module
This work addresses the domain discrepancy in face recognition for surveillance applications, but its contribution is incremental, since it builds on existing methods for extracting domain-invariant features.
The paper tackles heterogeneous face recognition (HFR) between near-infrared (NIR) and visible-light (VIS) images, especially under pose and emotion variations. It proposes a part relation attention module and a component adaptive triplet loss, and reports performance improvements on CASIA NIR-VIS 2.0 and superior results on BUAA-VisNir.
In face recognition applications, we need to process facial images captured under various conditions, such as at night by near-infrared (NIR) surveillance cameras. The illumination difference between NIR and visible-light (VIS) images causes a domain gap between facial images, and variations in pose and emotion make facial matching even more difficult. Heterogeneous face recognition (HFR) suffers from this domain discrepancy, and many studies have focused on extracting domain-invariant features, such as facial part relational information. However, when pose variation occurs, the positions of the facial components change, and a different part relation is extracted. In this paper, we propose a part relation attention module that crops facial parts obtained through a semantic mask and performs relational modeling using the representative feature of each part. Furthermore, we propose a component adaptive triplet loss function that uses adaptive weights for each part to reduce the intra-class variation of identity features regardless of domain and pose. Finally, our method shows a performance improvement on CASIA NIR-VIS 2.0 and achieves superior results on BUAA-VisNir, which has large pose and emotion variations.
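The abstract does not state the formula of the component adaptive triplet loss, so the following is only a minimal sketch of one possible reading: a standard triplet hinge loss is computed per facial part and the per-part terms are combined with weights that sum to one. The part names, the `part_scores` input, the softmax weighting, the Euclidean distance, and the `margin` value are all assumptions made for illustration, not details taken from the paper.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def part_weighted_triplet_loss(anchor, positive, negative, part_scores, margin=0.5):
    """Weighted sum of per-part triplet hinge losses.

    anchor, positive, negative: dict mapping a part name to its feature vector
        (e.g. anchor from a NIR image, positive/negative from VIS images).
    part_scores: dict mapping a part name to a raw score (hypothetical input;
        how the adaptive weights are really obtained is not given in the text).
    """
    parts = sorted(anchor)
    weights = softmax([part_scores[p] for p in parts])
    loss = 0.0
    for weight, part in zip(weights, parts):
        d_pos = l2_distance(anchor[part], positive[part])
        d_neg = l2_distance(anchor[part], negative[part])
        # Hinge: penalize only when the negative is not at least
        # `margin` farther from the anchor than the positive.
        loss += weight * max(0.0, d_pos - d_neg + margin)
    return loss


# Hypothetical two-dimensional part features for one triplet.
nir_anchor = {"eyes": [1.0, 0.0], "mouth": [1.0, 1.0], "nose": [0.0, 1.0]}
vis_same_id = {"eyes": [1.0, 0.0], "mouth": [1.0, 1.0], "nose": [0.0, 1.0]}
vis_other_id = {"eyes": [5.0, 5.0], "mouth": [-4.0, 3.0], "nose": [6.0, -2.0]}
scores = {"eyes": 2.0, "mouth": 0.5, "nose": 1.0}

easy_loss = part_weighted_triplet_loss(nir_anchor, vis_same_id, vis_other_id, scores)
hard_loss = part_weighted_triplet_loss(nir_anchor, vis_same_id, vis_same_id, scores)
```

In this sketch, a part with a larger score contributes more to the total, which is one way a loss could emphasize parts that stay reliable under pose change. The loss is zero when every part of the negative is well separated from the anchor, and it equals the margin when the negative is indistinguishable from the positive.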