SD AI AS

Dec 29, 2024

Ensemble of classifiers for speech evaluation

G. Belokrylov, A. Korenev, B. Lodonova, A. Novokhrestov

arXiv:2501.00067v1

h-index: 1

Originality Synthesis-oriented

AI Analysis

This work addresses speech evaluation for medical applications, but it is incremental as it applies standard ensemble methods to a new dataset.

The authors tackled the problem of assessing speech quality in medical contexts by applying an ensemble of binary classifiers to a dataset with quantitative metrics and expert labels, resulting in a slight increase in classification accuracy compared to individual classifiers.
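The idea of combining several binary classifiers into one decision can be sketched as follows. The summary above does not say how the ensemble members are combined, so weighted soft voting is used here purely as an assumption, not as the authors' method. The names `soft_vote`, `clf_a`, `clf_b`, `clf_c` and the weights are hypothetical, and the three rule-based "classifiers" are invented stand-ins for trained models.

```python
from typing import Callable, Sequence

# A binary classifier here is any function mapping a feature vector to an
# estimated probability of class 1 (high-quality speech).
Classifier = Callable[[Sequence[float]], float]


def soft_vote(
    classifiers: Sequence[Classifier],
    weights: Sequence[float],
    features: Sequence[float],
    threshold: float = 0.5,
) -> int:
    """Return 1 if the weighted mean probability of class 1 reaches threshold, else 0."""
    if len(classifiers) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_prob = sum(w * clf(features) for clf, w in zip(classifiers, weights)) / total
    return 1 if mean_prob >= threshold else 0


# Hypothetical stand-ins for trained models (assumption: invented rules,
# not taken from the paper).
def clf_a(x: Sequence[float]) -> float:
    return 0.9 if x[0] < 1.0 else 0.2


def clf_b(x: Sequence[float]) -> float:
    return 0.6 if x[1] > 0.5 else 0.4


def clf_c(x: Sequence[float]) -> float:
    return 0.3


members = [clf_a, clf_b, clf_c]
weights = [2.0, 1.0, 1.0]  # assumed weights, e.g. derived from validation accuracy

label = soft_vote(members, weights, [0.5, 0.8])
```

In this toy setup the more heavily weighted member can outvote the other two, which is one way an ensemble may end up slightly more accurate than any single classifier.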

The article describes an attempt to apply an ensemble of binary classifiers to the problem of speech assessment in medicine. A dataset was compiled from quantitative and expert assessments of syllable pronunciation quality. Quantitative assessments from 7 selected metrics were used as features: dynamic time warping (DTW) distance, Minkowski distance, correlation coefficient, longest common subsequence (LCSS), edit distance on real sequence (EDR), edit distance with real penalty (ERP), and move-split-merge (MSM). Expert assessment of pronunciation quality was used as the class label: class 1 means high-quality speech, class 0 means distorted speech. Training results were compared for five classification methods: logistic regression (LR), support vector machine (SVM), naive Bayes (NB), decision trees (DT), and k-nearest neighbors (KNN). The results of using the mixture method to build an ensemble of classifiers are also presented. On the studied datasets, the ensemble slightly increased classification accuracy compared with the individual binary classifiers.
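To make the feature side concrete, here is a minimal sketch of the first listed metric, dynamic time warping distance, between two one-dimensional sequences. The abstract does not specify the local cost, any window constraint, or the signal representation, so the absolute-difference cost, the unconstrained warping path, and the plain lists of floats used here are all assumptions. The function name `dtw_distance` is hypothetical.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Unconstrained DTW distance with absolute-difference local cost.

    Runs in O(len(a) * len(b)) time and O(len(b)) memory.
    """
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # prev[j] is the best cumulative cost of aligning the part of `a`
    # processed so far with b[:j]; prev[0] = 0 is the empty alignment.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: step in `a` only, step in `b` only,
            # or a diagonal step in both sequences.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[-1]


# A time-stretched copy of a sequence has zero DTW distance to the original.
reference = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
d = dtw_distance(reference, stretched)
```

Under these assumptions, one such distance between a patient's syllable recording and a reference would give a single entry of the 7-element feature vector; the other six metrics would fill the remaining entries.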
