ASSD Oct 21, 2020

Multi-task Metric Learning for Text-independent Speaker Verification

arXiv:2010.10919v2
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for speaker verification systems.

The paper aims to improve deep embedding learning for text-independent speaker verification by adding a pair-based metric learning loss as an auxiliary task alongside the usual cross-entropy loss, and reports that the method is effective on the Speakers in the Wild (SITW) dataset.

In this work, we introduce metric learning (ML) to enhance deep embedding learning for text-independent speaker verification (SV). Specifically, the deep speaker embedding network is trained with a conventional cross-entropy loss and an auxiliary pair-based ML loss. For the auxiliary ML task, the training samples of a mini-batch are first arranged into pairs; positive and negative pairs are then selected and weighted according to their own and relative similarities; finally, the auxiliary ML loss is computed from the similarities of the selected pairs. To evaluate the proposed method, we conduct experiments on the Speakers in the Wild (SITW) dataset. The results demonstrate the effectiveness of the proposed method.
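The three steps of the auxiliary task (pairing, selection and weighting, loss computation) can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below borrows a multi-similarity-style rule as a stand-in: pairs are mined by comparing each pair's similarity with the hardest pair of the opposite type, and the surviving pairs are softly weighted through a log-sum-exp term. The function names, the hyperparameters `alpha`, `beta`, `lam`, `margin`, and the weight `ml_weight` are all assumptions for illustration, not values from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (plain lists)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_based_ml_loss(embeddings, labels,
                       alpha=2.0, beta=50.0, lam=0.5, margin=0.1):
    """Hypothetical pair-based auxiliary loss (multi-similarity style).

    All hyperparameters are illustrative assumptions.
    """
    n = len(embeddings)
    # Step 1: arrange the mini-batch into pairs via a similarity matrix.
    sim = [[cosine(embeddings[i], embeddings[j]) for j in range(n)]
           for i in range(n)]
    total = 0.0
    for i in range(n):
        pos = [sim[i][j] for j in range(n)
               if j != i and labels[j] == labels[i]]
        neg = [sim[i][j] for j in range(n) if labels[j] != labels[i]]
        if not pos or not neg:
            continue
        # Step 2a: select pairs by relative similarity. A negative is kept
        # if it is nearly as similar as the least similar positive, and a
        # positive is kept if it is barely above the most similar negative.
        hard_neg = [s for s in neg if s + margin > min(pos)]
        hard_pos = [s for s in pos if s - margin < max(neg)]
        # Step 2b and 3: the log-sum-exp terms weight each selected pair by
        # its own similarity and accumulate the loss.
        if hard_pos:
            total += math.log1p(
                sum(math.exp(-alpha * (s - lam)) for s in hard_pos)) / alpha
        if hard_neg:
            total += math.log1p(
                sum(math.exp(beta * (s - lam)) for s in hard_neg)) / beta
    return total / n


def total_loss(ce_loss, ml_loss, ml_weight=1.0):
    """Multi-task objective: cross-entropy plus weighted auxiliary ML loss."""
    return ce_loss + ml_weight * ml_loss


if __name__ == "__main__":
    emb = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
    print(pair_based_ml_loss(emb, [0, 0, 1, 1]))  # well-separated speakers
    print(pair_based_ml_loss(emb, [0, 1, 0, 1]))  # confused speakers
```

In this sketch a batch whose speakers are already well separated contributes no selected pairs, so the auxiliary term only pushes on pairs that the embedding currently gets wrong or nearly wrong. In an actual training loop the same computation would be written with a differentiable tensor library so that gradients reach the embedding network.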

