CL, Nov 27, 2018

Speaker Diarization With Lexical Information

arXiv:1811.10761v2, 3 citations
Originality: Incremental advance
AI Analysis

This work addresses speaker diarization, but its contribution is incremental: it extends existing embedding-based methods by adding lexical information.

The paper tackles speaker diarization by integrating lexical and acoustic information into speaker clustering, achieving a meaningful improvement over a baseline system on the CALLHOME American English Speech dataset.

This work presents a novel approach to leveraging lexical information for speaker diarization. We introduce a speaker diarization system that directly integrates lexical as well as acoustic information into the speaker clustering process. To this end, we propose an adjacency matrix integration technique that combines word-level speaker turn probabilities with speaker embeddings. Our proposed method works without any reference transcript: words and word boundary information are provided by an ASR system. We show that our proposed method improves on a baseline speaker diarization system based solely on speaker embeddings, achieving a meaningful improvement on the CALLHOME American English Speech dataset.
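
The abstract does not spell out how the two adjacency matrices are built or combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one speaker embedding per word, a lexical adjacency defined as the probability that no speaker turn occurs between two words, a weighted sum with a hypothetical weight `alpha`, and a simple two-speaker spectral split. All function names are illustrative.

```python
import numpy as np


def acoustic_affinity(emb):
    """Cosine similarity between per-word embeddings, rescaled to [0, 1]."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return (unit @ unit.T + 1.0) / 2.0


def lexical_adjacency(turn_prob):
    """Assumed rule: turn_prob[k] is P(speaker change between word k and k+1);
    entry (i, j) is the probability of no change anywhere between i and j."""
    log_keep = np.log1p(-np.clip(turn_prob, 0.0, 1.0 - 1e-9))
    # cum[i] is the sum of log_keep over the gaps before word i.
    cum = np.concatenate([[0.0], np.cumsum(log_keep)])
    return np.exp(-np.abs(cum[:, None] - cum[None, :]))


def integrate(acoustic, lexical, alpha=0.5):
    """Assumed integration: a weighted sum of the two adjacency matrices."""
    return alpha * acoustic + (1.0 - alpha) * lexical


def two_speaker_labels(affinity):
    """Split words into two speakers by the sign of the second eigenvector
    of the symmetric normalized graph Laplacian."""
    deg = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(deg)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues returned in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    labels = (fiedler > 0).astype(int)
    # Eigenvector sign is arbitrary, so fix the first word to speaker 0.
    return labels if labels[0] == 0 else 1 - labels


# Toy input: six words, with a likely speaker turn after the third word.
emb = np.array([[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
                [0.1, 1.0], [0.2, 0.9], [0.0, 1.0]])
turn_prob = np.array([0.05, 0.05, 0.9, 0.05, 0.05])

combined = integrate(acoustic_affinity(emb), lexical_adjacency(turn_prob))
labels = two_speaker_labels(combined)
print(labels)
```

In a real pipeline the embeddings would come from a speaker embedding extractor, the turn probabilities from a model run on ASR output, and the clustering would need to handle an unknown number of speakers; all of those pieces are outside this sketch.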

