A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
This work addresses the need for integrated spoofing-aware speaker verification systems, an incremental advance over existing standalone countermeasures.
The paper addresses the degradation of speaker verification systems under voice spoofing attacks by proposing a probabilistic fusion framework that integrates spoofing countermeasures, reducing the SASV equal error rate from 19.31% for the baseline to 1.53% on the official evaluation trials.
The performance of automatic speaker verification (ASV) systems can be degraded by voice spoofing attacks. Most existing work has aimed to develop standalone spoofing countermeasure (CM) systems; relatively little has targeted an integrated spoofing-aware speaker verification (SASV) system. The recent SASV challenge encouraged such integration by releasing official protocols and baselines. In this paper, we build a probabilistic framework for fusing the ASV and CM subsystem scores. Based on this framework, we further propose fusion strategies for direct inference and for fine-tuning to predict the SASV score. Surprisingly, these strategies reduce the SASV equal error rate (EER) from the baseline's 19.31% to 1.53% on the official evaluation trials of the SASV challenge. We verify the effectiveness of the proposed components through ablation studies and provide insights through score distribution analysis.
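The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one way a probabilistic fusion of ASV and CM scores could look. It assumes that the two subsystem decisions are treated as independent, so that the SASV score is the product of a "target speaker" probability and a "bona fide" probability. The sigmoid calibration, the `scale` and `bias` parameters, the function names, and the toy scores are all hypothetical and chosen for illustration; none of them are taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_scores(asv_score, cm_logit, scale=1.0, bias=0.0):
    """Product-rule fusion under an assumed independence of ASV and CM decisions.

    asv_score: raw ASV similarity (e.g. cosine), mapped to a probability
               with a hypothetical affine calibration followed by a sigmoid.
    cm_logit:  CM logit for the bona fide class.
    """
    p_target = sigmoid(scale * np.asarray(asv_score, dtype=float) + bias)
    p_bonafide = sigmoid(np.asarray(cm_logit, dtype=float))
    return p_target * p_bonafide

def equal_error_rate(scores, labels):
    """Approximate EER: the threshold where false acceptance and false rejection rates are closest."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.unique(scores)  # sorted ascending
    far = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    frr = np.array([(scores[labels] < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0

# Toy trials (made up): two target bona fide, two non-target bona fide, two spoofed.
asv = np.array([0.8, 0.7, -0.2, -0.1, 0.75, 0.6])
cm = np.array([4.0, 3.0, 4.0, 3.0, -4.0, -3.0])
labels = np.array([1, 1, 0, 0, 0, 0])  # 1 = accept (target and bona fide)

fused = fuse_scores(asv, cm, scale=10.0)
print("ASV-only EER:", equal_error_rate(asv, labels))
print("Fused EER:   ", equal_error_rate(fused, labels))
```

In this toy setup the spoofed trials receive high ASV scores, so the ASV-only system cannot reject them, while the product with the CM probability pushes their fused scores toward zero. This mirrors, in miniature, why an integrated SASV score can outperform an ASV score alone.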