ASSD Oct 25, 2019

Overlap-aware diarization: resegmentation using neural end-to-end overlapped speech detection

arXiv:1910.11646v1, 107 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of improving speaker diarization accuracy in multi-speaker recordings, with applications such as meeting analysis. It is an incremental advance.

The paper tackles overlapping speech in diarization with a neural LSTM-based overlap detection module and a resegmentation step. The overlap detector achieves state-of-the-art performance on the AMI, DIHARD, and ETAPE corpora, and overlap-aware resegmentation yields a 20% relative DER reduction over the baseline on AMI.

We address the problem of effectively handling overlapping speech in a diarization system. First, we detail a neural Long Short-Term Memory-based architecture for overlap detection. Secondly, detected overlap regions are exploited in conjunction with a frame-level speaker posterior matrix to make two-speaker assignments for overlapped frames in the resegmentation step. The overlap detection module achieves state-of-the-art performance on the AMI, DIHARD, and ETAPE corpora. We apply overlap-aware resegmentation on AMI, resulting in a 20% relative DER reduction over the baseline system. While this approach is by no means an end-all solution to overlap-aware diarization, it reveals promising directions for handling overlap.
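
The two-speaker assignment described in the abstract can be illustrated with a minimal sketch. It assumes the frame-level speaker posterior matrix is available as a NumPy array of shape (frames, speakers) and the overlap detector's output as a boolean mask per frame. The function name `assign_speakers` and its arguments are hypothetical, and the paper's actual resegmentation may differ, for example in how the posteriors are obtained or how non-speech frames are handled.

```python
import numpy as np


def assign_speakers(posteriors, overlap_mask):
    """Illustrative overlap-aware assignment (not the paper's exact method).

    posteriors:   (n_frames, n_speakers) frame-level speaker posteriors.
    overlap_mask: (n_frames,) booleans, True where overlap was detected.
    Returns a boolean (n_frames, n_speakers) speaker-activity matrix.
    Assumption: every frame is speech; non-speech handling is omitted.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    overlap_mask = np.asarray(overlap_mask, dtype=bool)
    n_frames, n_speakers = posteriors.shape
    if n_speakers < 2:
        raise ValueError("need at least two speakers for overlap assignment")
    if overlap_mask.shape != (n_frames,):
        raise ValueError("overlap_mask must have one entry per frame")

    # Rank speakers in each frame from most to least likely.
    ranked = np.argsort(-posteriors, axis=1, kind="stable")

    activity = np.zeros((n_frames, n_speakers), dtype=bool)
    frames = np.arange(n_frames)

    # Every frame gets its most likely speaker.
    activity[frames, ranked[:, 0]] = True

    # Frames flagged as overlap also get the second most likely speaker.
    activity[frames[overlap_mask], ranked[overlap_mask, 1]] = True
    return activity


# Toy input: three frames, three speakers, overlap detected in the middle frame.
example_posteriors = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.3, 0.6],
]
example_overlap = [False, True, False]
example_activity = assign_speakers(example_posteriors, example_overlap)
```

In this toy input, only the middle frame is marked as overlap, so it is the only frame that receives two active speakers. The other frames keep a single speaker each.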

