SD · Nov 5, 2014

An Interesting Property of LPCs for Sonorant Vs Fricative Discrimination

arXiv:1411.1267v1 · 4 citations
Originality: Incremental advance
AI Analysis

The paper offers a robust feature for speech signal segmentation and non-uniform frame-rate analysis, with potential applications in speech processing. The advance appears incremental, since it builds on existing LPC techniques.

The paper tackles the problem of discriminating between sonorant and fricative speech sounds by analyzing a property of linear prediction coefficients (LPCs). Specifically, it uses the inverse tangent of A(1), termed the sonorant-fricative discrimination index (SFDI), and reports an accuracy of 99.07% on the TIMIT database.

The linear prediction (LP) technique estimates an optimum all-pole filter of a given order for a frame of a speech signal. The coefficients of the all-pole filter, 1/A(z), are referred to as LP coefficients (LPCs). The gain of the inverse filter A(z) at z = 1, i.e., at zero frequency, is A(1), which equals the sum of the LPCs and has the property of being lower than a threshold for sonorants and higher than it for fricatives. When the inverse tangent of A(1), denoted T(1), is used as a feature and tested on the sonorant and fricative frames of the entire TIMIT database, an accuracy of 99.07% is obtained. Hence, we refer to T(1) as the sonorant-fricative discrimination index (SFDI). This property has also been tested for robustness to additive white noise and on the telephone-quality speech of the NTIMIT database. These results are comparable to, or in some respects better than, the state-of-the-art methods proposed for a similar task. Such a property may be used for segmenting a speech signal or for non-uniform frame-rate analysis.
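The feature described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the LP order of 12, the Hamming window, the autocorrelation (Levinson-Durbin) method, the tiny regularization of the zero-lag autocorrelation, and the two synthetic test frames are all assumptions made here for illustration, and the function names `lpc_autocorr`, `sfdi`, and `demo_frames` are hypothetical. The sketch takes A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, so A(1) is the sum of all coefficients including the leading 1, and T(1) = arctan(A(1)). No decision threshold is included, because the abstract does not state one.

```python
import numpy as np


def lpc_autocorr(frame, order=12):
    """Return [1, a_1, ..., a_p], the coefficients of the inverse filter A(z),
    estimated with the autocorrelation method (Levinson-Durbin recursion).
    The order and the Hamming window are illustrative assumptions."""
    x = np.asarray(frame, dtype=float)
    x = x * np.hamming(len(x))
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    # Tiny regularization (assumption) so the recursion stays well defined.
    r[0] = r[0] * (1.0 + 1e-9) + 1e-12

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # prediction error exhausted; keep remaining coefficients at 0
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def sfdi(frame, order=12):
    """T(1) = arctan(A(1)), where A(1) is the sum of the inverse-filter coefficients."""
    a = lpc_autocorr(frame, order)
    return float(np.arctan(np.sum(a)))


def demo_frames(fs=16000, n=320, seed=0):
    """Two synthetic 20 ms frames (assumptions, not TIMIT data):
    a low-frequency harmonic signal and high-pass filtered noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs
    sonorant_like = sum(np.sin(2 * np.pi * f * t) for f in (200.0, 400.0, 600.0))
    sonorant_like = sonorant_like + 0.01 * rng.standard_normal(n)
    fricative_like = np.diff(rng.standard_normal(n + 1))  # first difference = high-pass
    return sonorant_like, fricative_like


if __name__ == "__main__":
    son, fric = demo_frames()
    print("T(1), sonorant-like frame: ", sfdi(son))
    print("T(1), fricative-like frame:", sfdi(fric))
```

On these synthetic frames, the low-frequency signal should give a smaller T(1) than the high-pass noise, which matches the direction of the property stated in the abstract. Real speech would require frame labels and a threshold chosen on data, neither of which is specified here.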

