AS CL SD May 31, 2020

Residual Excitation Skewness for Automatic Speech Polarity Detection

arXiv:2006.00525v1, 33 citations
AI Analysis

This work addresses a critical problem for speech processing systems that handle large volumes of data from multiple devices, though the contribution is incremental because it builds on existing polarity detection methods.

The paper tackles automatic speech polarity detection, which is crucial for speech processing performance, by proposing a simple algorithm based on the skewness of two excitation signals. It reports an error rate of 0.06% in clean conditions, outperforms four state-of-the-art methods, and shows strong robustness in noisy and reverberant environments.

Detecting the correct speech polarity is a necessary step prior to several speech processing techniques. An error in its determination could have a dramatic detrimental impact on their performance. As current systems have to deal with increasing amounts of data stemming from multiple devices, the automatic detection of speech polarity has become a crucial problem. For this purpose, we here propose a very simple algorithm based on the skewness of two excitation signals. The method is shown on 10 speech corpora (8545 files) to lead to an error rate of only 0.06% in clean conditions and to clearly outperform four state-of-the-art methods. In addition, its simplicity significantly reduces the computational load, and it is observed to exhibit the strongest robustness in both noisy and reverberant environments.
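The abstract does not spell out the decision rule, so the following is only a minimal sketch of why skewness can carry polarity information. Inverting a signal flips the sign of its skewness, so a statistic built from the skewness of two excitation signals changes sign when the recording polarity is inverted. The function names, the "sign of the difference" rule, and the toy signals are assumptions made for illustration. They are not taken from the paper, and the computation of the excitation signals themselves is not shown.

```python
def skewness(x):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    if m2 == 0:
        return 0.0  # a constant signal has no asymmetry
    return m3 / m2 ** 1.5


def polarity_from_skewness(exc_a, exc_b):
    """Hypothetical rule (assumption, not the paper's stated method):
    return +1 or -1 from the sign of the skewness difference between
    two excitation signals."""
    diff = skewness(exc_a) - skewness(exc_b)
    return 1 if diff >= 0 else -1


# Toy stand-ins for two excitation signals (assumed, not real speech data).
exc_a = [0, 0, 0, 0, 5, 0, 0, -1]   # one strong positive peak
exc_b = [1, -1, 1, -1]              # symmetric, zero skewness

original = polarity_from_skewness(exc_a, exc_b)
inverted = polarity_from_skewness([-v for v in exc_a], [-v for v in exc_b])
print(original, inverted)
```

Under these assumptions, inverting both toy signals flips the decision, which is the property a skewness-based polarity detector relies on.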

