ASSD, Jun 9, 2021

Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization

arXiv:2106.04764v1, 17 citations
Originality: Incremental advance
AI Analysis

This work addresses the data labeling bottleneck for speaker diarization systems, particularly in overlapping speech scenarios, though it is incremental as it builds on existing EEND methods.

The paper tackles the problem of training end-to-end neural diarization (EEND) models, which require extensive labeled data for overlapping speech, by proposing a semi-supervised pseudo-labeling approach that uses unlabeled data. The result is a 37.4% relative reduction in diarization error rate on the CALLHOME dataset compared to a seed model, with effectiveness also shown on the DIHARD dataset.
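The "relative reduction" figure compares the adapted model's diarization error rate (DER) against the seed model's DER. A minimal sketch of that arithmetic follows; the function name and the DER values in the usage line are hypothetical, chosen only to illustrate the formula, and are not taken from the paper.

```python
def relative_reduction(baseline_der: float, new_der: float) -> float:
    """Return the relative DER reduction of new_der with respect to baseline_der.

    Both arguments are error rates on the same scale (for example, percent).
    """
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der


# Hypothetical values: a seed-model DER of 10.0% and an adapted DER of 6.26%
# would correspond to a relative reduction of about 0.374 (37.4%).
example = relative_reduction(10.0, 6.26)
```

Note that a relative reduction says nothing about the absolute DER on its own; the same 37.4% could come from very different baseline values.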

In this paper, we present a semi-supervised training technique using pseudo-labeling for end-to-end neural diarization (EEND). EEND has shown promising performance compared with traditional clustering-based methods, especially on overlapping speech. However, a well-tuned EEND model requires labeled data covering the joint speech activities of every speaker at each time frame of a recording. We therefore explore a pseudo-labeling approach that exploits unlabeled data. First, we propose an iterative pseudo-labeling method for EEND, which trains the model using unlabeled data from a target condition. Then, we propose a committee-based training method to improve the performance of EEND. To evaluate the proposed methods, we conduct model adaptation experiments using labeled and unlabeled data. Experimental results on the CALLHOME dataset show that the proposed pseudo-labeling achieves a 37.4% relative diarization error rate reduction compared to a seed model. We also analyze the results of semi-supervised adaptation with pseudo-labeling and show the effectiveness of our approach on the third DIHARD dataset.
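The iterative pseudo-labeling idea in the abstract can be sketched generically: a seed model labels the unlabeled target-condition recordings, its outputs are binarized into per-frame, per-speaker activity labels, and the model is retrained on labeled plus pseudo-labeled data. The sketch below is an illustration under stated assumptions, not the paper's implementation: the names `binarize` and `iterative_pseudo_label`, the `predict` and `train` callbacks, the 0.5 threshold, and the number of rounds are all hypothetical, and the committee-based step is not covered.

```python
from typing import Any, Callable, List, Sequence, Tuple

Posteriors = List[List[float]]  # [frame][speaker] speech-activity scores in [0, 1]
Labels = List[List[int]]        # [frame][speaker] binary speech activity


def binarize(posteriors: Posteriors, threshold: float = 0.5) -> Labels:
    """Threshold each speaker independently, so overlapping speech is allowed."""
    return [[1 if p > threshold else 0 for p in frame] for frame in posteriors]


def iterative_pseudo_label(
    model: Any,
    predict: Callable[[Any, Any], Posteriors],                # hypothetical inference callback
    train: Callable[[Any, List[Tuple[Any, Labels]]], Any],    # hypothetical training callback
    labeled: Sequence[Tuple[Any, Labels]],
    unlabeled: Sequence[Any],
    rounds: int = 2,          # assumed value, not from the paper
    threshold: float = 0.5,   # assumed value, not from the paper
) -> Any:
    """Repeatedly relabel the unlabeled data with the current model and retrain."""
    for _ in range(rounds):
        # Label every unlabeled recording with the current model.
        pseudo = [(x, binarize(predict(model, x), threshold)) for x in unlabeled]
        # Retrain on the union of human-labeled and pseudo-labeled recordings.
        model = train(model, list(labeled) + pseudo)
    return model


# Two frames, two speakers: both active in frame 0, neither in frame 1.
demo_labels = binarize([[0.9, 0.6], [0.1, 0.4]])  # [[1, 1], [0, 0]]
```

Thresholding each speaker separately mirrors the multi-label view of diarization described above, in which more than one speaker can be active in the same frame.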
