CV · Dec 2, 2025

RFOP: Rethinking Fusion and Orthogonal Projection for Face-Voice Association

arXiv:2512.02860v11 citations · h-index: 8
Originality: Synthesis-oriented
AI Analysis

This work addresses face-voice association for multilingual applications, but it is incremental as it builds on existing fusion and projection techniques.

The paper tackles the face-voice association task in a multilingual environment by revisiting fusion and orthogonal projection so that the model focuses on the relevant semantic information in each modality. It achieves an EER of 33.1 and ranks 3rd in the FAME 2026 challenge.

The Face-voice association in multilingual environment (FAME) challenge 2026 investigates the face-voice association task in a multilingual scenario. The challenge introduces English-German face-voice pairs for use in the evaluation phase. To this end, we revisit fusion and orthogonal projection for face-voice association, focusing on the relevant semantic information within the two modalities. Our method performs favorably on the English-German data split and ranked 3rd in the FAME 2026 challenge with an EER of 33.1.
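The reported metric, EER (equal error rate), is the operating point at which the false acceptance rate and the false rejection rate of a verification system coincide. The following is a minimal sketch of how it can be approximated from pairwise scores. It is not the challenge's official scoring script: the function name `equal_error_rate`, the convention that a higher score means "same identity", and the toy inputs are all assumptions made for illustration. The function returns a fraction, and the summary above does not state the unit of the 33.1 figure.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER (as a fraction in [0, 1]).

    scores: similarity score per face-voice pair (higher = more likely a match;
            this convention is an assumption of the sketch).
    labels: 1 if the face and voice belong to the same identity, else 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one matching and one non-matching pair")

    best_gap, eer = None, None
    # Sweep every observed score as a candidate decision threshold.
    for t in sorted(set(scores)):
        far = sum(s >= t for s in neg) / len(neg)  # non-matches wrongly accepted
        frr = sum(s < t for s in pos) / len(pos)   # matches wrongly rejected
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy, made-up scores (not data from the paper or the challenge).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

On the toy input, the threshold 0.6 accepts one of four non-matching pairs and rejects one of four matching pairs, so both error rates are 0.25 and `toy_eer` is 0.25. Lower is better; random guessing on balanced pairs gives an EER near 0.5.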
