ML · SD · Mar 26, 2014

Constrained speaker linking

arXiv:1403.7084v2
Originality: Incremental advance
AI Analysis

This paper addresses speaker linking in constrained scenarios such as telephone conversations and offers a practical solution for database annotation. The advance is incremental, since it builds on existing Bayesian and speaker recognition methods.

The paper tackles the speaker linking problem under constraints on how speaker identities are distributed over recordings, and shows that the problem becomes tractable when the constraints pre-partition the data into smaller cliques with non-overlapping speakers. For the Dutch CGN database, a lightweight speaker recognition system solved the channel assignment task for 93% of the cliques.

In this paper we study speaker linking (a.k.a. partitioning) given constraints on the distribution of speaker identities over speech recordings. Specifically, we show that the intractable partitioning problem becomes tractable when the constraints pre-partition the data into smaller cliques with non-overlapping speakers. The surprisingly common case where speakers in telephone conversations are known, but the assignment of channels to identities is unspecified, is treated in a Bayesian way. We show that for the Dutch CGN database, where this channel assignment task is at hand, a lightweight speaker recognition system can quite effectively solve the channel assignment problem, with 93% of the cliques solved. We further show that the posterior distribution over channel assignment configurations is well calibrated.
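To make the channel assignment idea concrete, the following is a minimal sketch, not the paper's actual model. It assumes that each conversation has two channels and two known speakers, so a clique of k conversations has 2^k candidate configurations. It further assumes a flat prior over configurations and a simplified likelihood that sums hypothetical pairwise same-speaker log-likelihood-ratio scores as if they were independent. The function name `configuration_posterior`, the recording ids `(conversation_index, channel)`, and all score values are illustrative assumptions.

```python
import itertools
import math


def configuration_posterior(conversations, llr):
    """Posterior over channel assignment configurations for one clique.

    conversations: list of (speaker_a, speaker_b) pairs of known identities.
    llr: dict mapping frozenset({rec1, rec2}) to a same-speaker
         log-likelihood ratio, where a recording id is (conv_index, channel).
    A configuration is a tuple of bits, one per conversation:
    0 means channel 0 is speaker_a, 1 means the channels are swapped.
    ASSUMPTION: flat prior, pairwise scores treated as independent.
    """
    log_scores = {}
    for config in itertools.product((0, 1), repeat=len(conversations)):
        # Label every recording with a speaker under this configuration.
        label = {}
        for i, ((a, b), swap) in enumerate(zip(conversations, config)):
            label[(i, 0)] = b if swap else a
            label[(i, 1)] = a if swap else b
        # Add the score of every pair that this configuration calls same-speaker.
        total = 0.0
        for pair, score in llr.items():
            rec1, rec2 = tuple(pair)
            if label[rec1] == label[rec2]:
                total += score
        log_scores[config] = total
    # Normalise with log-sum-exp for numerical stability.
    peak = max(log_scores.values())
    log_z = peak + math.log(sum(math.exp(s - peak) for s in log_scores.values()))
    return {c: math.exp(s - log_z) for c, s in log_scores.items()}


# Toy clique: speaker "A" takes part in both conversations.
conversations = [("A", "B"), ("A", "C")]
llr = {
    frozenset({(0, 0), (1, 1)}): 4.0,   # these two recordings sound alike
    frozenset({(0, 0), (1, 0)}): -3.0,
    frozenset({(0, 1), (1, 0)}): -2.0,
    frozenset({(0, 1), (1, 1)}): -3.0,
}
posterior = configuration_posterior(conversations, llr)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

In this toy clique the most probable configuration places speaker "A" on channel 0 of the first conversation and channel 1 of the second, because only that pair of recordings has a positive score. Exhaustive enumeration is feasible here only because the clique is small, which is the role the pre-partitioning constraints play in the abstract.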
