SDAS · May 21, 2018

Speaker Clustering Using Dominant Sets

arXiv:1805.08641v1 · 6 citations
Originality: Incremental advance
AI Analysis

This work addresses speaker clustering for speech processing, presenting an incremental improvement by adapting an existing algorithm to a new domain.

The paper tackles speaker clustering by applying the Dominant Sets algorithm, a graph-based method not previously used for this task, and reports state-of-the-art results on the TIMIT dataset under three standard metrics.

Speaker clustering is the task of forming speaker-specific groups from a set of utterances. In this paper, we address this task using Dominant Sets (DS). DS is a graph-based clustering algorithm with properties that fit our problem well, and it has never before been applied to speaker clustering. We report a comprehensive set of experiments on the TIMIT dataset against standard clustering techniques and dedicated speaker clustering methods. Moreover, we compare performance across different features: features learned by a deep neural network trained directly on TIMIT, and features extracted from a pre-trained VGGVox network. To assess stability, we perform a sensitivity analysis on the free parameters of our method, showing that performance is stable under parameter changes. The extensive experimentation confirms the validity of the proposed method, which achieves state-of-the-art results under three different standard metrics. We also report, for the first time, reference baseline results for speaker clustering on the entire TIMIT dataset.
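To make the graph-based idea concrete, here is a minimal sketch of Dominant Sets clustering in its standard formulation: replicator dynamics find one cohesive group in a pairwise affinity graph, that group is removed, and the process repeats on the remaining nodes. The abstract does not state the paper's similarity measure, stopping rule, or parameter values, so the cosine affinity, the `cutoff` and `tol` values, the toy embeddings, and all function names below are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine_affinity(vectors):
    """Symmetric, non-negative affinity matrix with a zero diagonal.

    Assumption: cosine similarity clamped at 0; the paper may use another measure.
    """
    n = len(vectors)
    norms = [math.sqrt(sum(v * v for v in vec)) for vec in vectors]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            A[i][j] = A[j][i] = max(0.0, dot / (norms[i] * norms[j]))
    return A


def replicator_dynamics(A, max_iter=1000, tol=1e-8):
    """Iterate x_i <- x_i * (Ax)_i / (x^T A x), starting from the uniform vector."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx <= 0.0:  # no positive affinity left among these nodes
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if delta < tol:
            break
    return x


def dominant_sets(A, cutoff=1e-4):
    """Peel off one dominant set at a time until every node is assigned."""
    remaining = list(range(len(A)))
    clusters = []
    while remaining:
        sub = [[A[i][j] for j in remaining] for i in remaining]
        x = replicator_dynamics(sub)
        # The support of the converged vector is the extracted cluster.
        members = [remaining[k] for k, w in enumerate(x) if w > cutoff]
        if not members:  # fallback: group whatever is left
            members = list(remaining)
        clusters.append(members)
        remaining = [i for i in remaining if i not in members]
    return clusters


# Hypothetical utterance embeddings: four from one speaker, two from another.
embeddings = [
    [1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [1.0, 0.1, 0.0], [0.95, 0.05, 0.0],
    [0.0, 0.0, 1.0], [0.0, 0.1, 1.0],
]
clusters = dominant_sets(cosine_affinity(embeddings))
print(clusters)
```

Note that the number of clusters is not given as an input: it follows from how many groups get peeled off. The pure-Python loops are chosen for readability; a practical version would use a vectorized matrix library.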

Code Implementations: 1 repo
