SD · LG · AS · Mar 7, 2022

Speaker recognition by means of a combination of linear and nonlinear predictive models

arXiv:2203.03190v1 · 8 citations · h-index: 34
Originality: Incremental advance
AI Analysis

This work addresses speaker recognition accuracy for applications like security or voice interfaces, but it is incremental as it builds on existing predictive models.

The paper tackles speaker recognition by combining the classical linear parameterization (LPCC) with nonlinear predictive models. Relative to using LPCC alone, the error rate falls by 2.63 percentage points (from 6.31% to 3.68%) when the residual comes from linear prediction, and by 3.68 percentage points when it comes from a nonlinear predictive model.
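The quoted figures mix absolute and relative readings, so a short calculation may help. The sketch below recomputes the reductions from the numbers in the summary. It assumes that the 3.68% quoted for the nonlinear residual is an absolute reduction from the same 6.31% baseline, which the summary does not state explicitly. The function and variable names are illustrative only.

```python
# Error rates (in percent) quoted in the summary.
baseline_error = 6.31   # LPCC alone
linear_error = 3.68     # LPCC measure combined with a linear-prediction residual measure


def reductions(before, after):
    """Return (absolute reduction in percentage points, relative reduction in percent)."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return absolute, relative


abs_lin, rel_lin = reductions(baseline_error, linear_error)
print(f"linear residual:    {abs_lin:.2f} points absolute, {rel_lin:.1f}% relative")

# Assumption: the 3.68% figure for the nonlinear residual is also an absolute
# reduction from the 6.31% baseline, so the implied error rate is derived here
# rather than taken from the source.
nonlinear_error = baseline_error - 3.68
abs_nl, rel_nl = reductions(baseline_error, nonlinear_error)
print(f"nonlinear residual: {abs_nl:.2f} points absolute, {rel_nl:.1f}% relative")
```

Under that assumption, the nonlinear case would correspond to an error rate of about 2.63%, a figure derived here and not quoted in the summary.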

This paper deals with the combination of nonlinear predictive models with the classical LPCC parameterization for speaker recognition. It is shown that combining a measure defined over the LPCC coefficients with a measure defined over the residual signal of a predictive analysis improves on the classical method, which considers only the LPCC coefficients. If the residual signal is obtained from a linear prediction analysis, the improvement is 2.63% (the error rate drops from 6.31% to 3.68%); if it is computed with a nonlinear predictive model based on neural networks, the improvement is 3.68%. An efficient algorithm for reducing the computational burden is also proposed.
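The abstract does not give the rule used to combine the two measures, so the following is only a minimal sketch of one common approach: normalize each per-speaker distance, take a weighted sum, and pick the speaker with the smallest combined score. The z-normalization, the weight `alpha`, the function names, and the toy distances are all assumptions made for illustration, not the paper's method or data.

```python
from statistics import mean, pstdev


def normalize(scores):
    """Z-normalize per-speaker distances so both measures share a comparable scale."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def combine(lpcc_dist, residual_dist, alpha=0.5):
    """Weighted sum of the normalized LPCC and residual distances (alpha is hypothetical)."""
    a = normalize(lpcc_dist)
    b = normalize(residual_dist)
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]


def identify(lpcc_dist, residual_dist, alpha=0.5):
    """Return the index of the enrolled speaker with the smallest combined distance."""
    combined = combine(lpcc_dist, residual_dist, alpha)
    return min(range(len(combined)), key=combined.__getitem__)


# Toy distances from one test utterance to three enrolled speakers (made-up values).
lpcc_dist = [0.90, 1.00, 1.40]      # measure over LPCC coefficients
residual_dist = [2.5, 1.0, 2.4]     # measure over the prediction residual

lpcc_only = min(range(len(lpcc_dist)), key=lpcc_dist.__getitem__)
both = identify(lpcc_dist, residual_dist)
print("LPCC only:", lpcc_only, "| combined:", both)
```

In this toy case the LPCC measure alone narrowly prefers speaker 0, while the residual measure shifts the combined decision to speaker 1, which illustrates how a second measure can correct a close call.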

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
