SD LG AS Jun 28, 2022

Domain Agnostic Few-shot Learning for Speaker Verification

Seunghan Yang, Debasmit Das, Janghoon Cho, Hyoungwoo Park, Sungrack Yun

arXiv:2206.13700v17.17 citations h-index: 30

Originality: Incremental advance

AI Analysis

This work addresses the failure of speaker verification systems to generalize to new users and domains, an incremental advance in domain generalization methods.

The paper tackles the failure of deep learning models for speaker verification to generalize to new users and environments by proposing a few-shot domain generalization framework. The results show performance improvements on the standard benchmark, and generalization ability is demonstrated explicitly on artificially generated noise domains.

Deep learning models for verification systems often fail to generalize to new users and new environments, even though they learn highly discriminative features. To address this problem, we propose a few-shot domain generalization framework that learns to tackle distribution shift for new users and new domains. Our framework consists of domain-specific and domain-aggregation networks, which are experts on specific and combined domains, respectively. Using these networks, we generate episodes that mimic the presence of both novel users and novel domains in the training phase, which ultimately produces better generalization. To save memory, we reduce the number of domain-specific networks by clustering similar domains together. Through extensive evaluation on artificially generated noise domains, we explicitly show the generalization ability of our framework. In addition, we apply our proposed methods to an existing competitive architecture on the standard benchmark, which shows further performance improvements.
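
The abstract does not spell out how episodes are built, so the following is only a minimal sketch of one way an episode could mimic both novel users and a novel domain: enrollment (support) utterances are drawn from one domain and test (query) utterances for the same speakers from a different one. The function name `sample_episode`, the `(speaker, domain, feature)` tuple layout, and the toy domain names are all hypothetical and are not taken from the paper; the domain-specific and domain-aggregation networks are not modeled here.

```python
import random
from collections import defaultdict


def sample_episode(utterances, n_speakers, k_support, k_query, rng):
    """Sample one cross-domain few-shot episode (illustrative assumption).

    utterances: iterable of (speaker_id, domain_id, feature) tuples.
    Returns (support_domain, query_domain, support, query), where support
    and query map each sampled speaker to a list of features.
    """
    # Index features by domain, then by speaker.
    by_domain = defaultdict(lambda: defaultdict(list))
    for speaker, domain, feature in utterances:
        by_domain[domain][speaker].append(feature)

    # Two distinct domains: one for enrollment, one acting as the unseen domain.
    support_domain, query_domain = rng.sample(sorted(by_domain), 2)

    # Keep speakers that have enough utterances in both sampled domains.
    eligible = sorted(
        s for s, feats in by_domain[support_domain].items()
        if len(feats) >= k_support
        and len(by_domain[query_domain].get(s, [])) >= k_query
    )
    if len(eligible) < n_speakers:
        raise ValueError("not enough speakers shared by the two sampled domains")

    # The sampled speakers play the role of novel users within this episode.
    speakers = rng.sample(eligible, n_speakers)
    support = {s: rng.sample(by_domain[support_domain][s], k_support) for s in speakers}
    query = {s: rng.sample(by_domain[query_domain][s], k_query) for s in speakers}
    return support_domain, query_domain, support, query


# Toy data: 6 speakers, 3 made-up noise domains, 5 utterances each.
toy = [
    (f"spk{s}", d, (s, d, u))
    for s in range(6)
    for d in ("clean", "babble", "street")
    for u in range(5)
]
src, tgt, support, query = sample_episode(
    toy, n_speakers=4, k_support=2, k_query=3, rng=random.Random(0)
)
```

In a sketch like this, a model trained episode by episode is repeatedly scored on speakers and a domain that differ from the enrollment conditions, which is the kind of shift the abstract says the framework is meant to handle.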
