CVMay 29
SteerFace: Debiasing Synthetic Face Generation via Adaptive Residue PerturbationYuxi Mi, Qiuyang Yuan, Jianqing Xu et al.
The shortage of legally compliant data for face recognition training has sparked growing interest in using synthetic data as an alternative. While recent diffusion-based methods enable the generation of photorealistic face images with strong identity adherence and data diversity, their downstream recognition performance still exhibits a significant synthetic-real gap. This paper identifies visual tendency as a previously underexplored limitation, whereby synthetic data exhibit an unrealistic prevalence of visual attributes and thus deviate from the real-data distribution. Visual tendency can be attributed to the generator's conditioning on identity embeddings, through which co-occurring residual visual cues are unintentionally absorbed into learned identity semantics. To discourage the generator from exploiting such visual cues, this paper proposes SteerFace, a simple and efficient training framework that perturbs identity embeddings by steering them toward random orthogonal directions on the embedding hypersphere. The perturbation serves as an identity-preserving regularizer that penalizes the generator's reliance on non-identity components, as supported by theoretical analysis. This paper further introduces an adaptive strategy that learns perturbation strengths with both sample-wise preference and favorable overall statistics. Extensive experiments show that SteerFace effectively mitigates visual tendency, outperforms prior methods in downstream face recognition, and generalizes well across different training datasets and generation pipelines.
CVApr 1, 2025
Data Synthesis with Diverse Styles for Face Recognition via 3DMM-Guided DiffusionYuxi Mi, Zhizhou Zhong, Yuge Huang et al.
Identity-preserving face synthesis aims to generate synthetic face images of virtual subjects that can substitute real-world data for training face recognition models. While prior arts strive to create images with consistent identities and diverse styles, they face a trade-off between them. Identifying their limitation of treating style variation as subject-agnostic and observing that real-world persons actually have distinct, subject-specific styles, this paper introduces MorphFace, a diffusion-based face generator. The generator learns fine-grained facial styles, e.g., shape, pose and expression, from the renderings of a 3D morphable model (3DMM). It also learns identities from an off-the-shelf recognition model. To create virtual faces, the generator is conditioned on novel identities of unlabeled synthetic faces, and novel styles that are statistically sampled from a real-world prior distribution. The sampling especially accounts for both intra-subject variation and subject distinctiveness. A context blending strategy is employed to enhance the generator's responsiveness to identity and style conditions. Extensive experiments show that MorphFace outperforms the best prior arts in face recognition efficacy.
CVOct 11, 2025
ImmerIris: A Large-Scale Dataset and Benchmark for Immersive Iris Recognition in Open ScenesYuxi Mi, Qiuyang Yuan, Zhizhou Zhong et al.
In egocentric applications such as augmented and virtual reality, immersive iris recognition is emerging as an accurate and seamless way to identify persons. While classic systems acquire iris images on-axis, i.e., via dedicated frontal sensors in controlled settings, the immersive setup primarily captures off-axis irises through tilt-placed headset cameras, with only mild control in open scenes. This yields unique challenges, including perspective distortion, intensified quality degradations, and intra-class variations in iris texture. Datasets capturing these challenges remain scarce. To fill this gap, this paper introduces ImmerIris, a large-scale dataset collected via VR headsets, containing 499,791 ocular images from 564 subjects. It is, to the best of current knowledge, the largest public dataset and among the first dedicated to off-axis acquisition. Based on ImmerIris, evaluation protocols are constructed to benchmark recognition methods under different challenging factors. Current methods, primarily designed for classic on-axis imagery, perform unsatisfactorily on the immersive setup, mainly due to reliance on fallible normalization. To this end, this paper further proposes a normalization-free paradigm that directly learns from ocular images with minimal adjustment. Despite its simplicity, this approach consistently outperforms normalization-based counterparts, pointing to a promising direction for robust immersive recognition.