Boosting Local Spectro-Temporal Features for Speech Analysis
This work addresses phone classification in speech recognition, but its contribution is incremental: it applies existing object detection methods to a new domain without demonstrating significant improvements.
The paper tackles phone classification for speech recognition by exploring local spectro-temporal features, specifically Haar features and SVM-classified Histograms of Gradients (HoG), both borrowed from object detection. It provides only preliminary results, without concrete performance numbers.
We introduce the problem of phone classification in the context of speech recognition and explore several sets of local spectro-temporal features that can be used for this task. In particular, we present preliminary results for phone classification using two sets of features that are commonly used for object detection: Haar features and SVM-classified Histograms of Gradients (HoG).
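
To make the idea of a local spectro-temporal Haar feature concrete, here is a minimal sketch of how one might be computed on a spectrogram patch. It assumes the standard integral-image formulation from object detection; the abstract does not specify the authors' implementation. The function names, the toy patch, and the choice of a two-rectangle feature along the time axis are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a two-rectangle Haar-like feature on a toy
# spectrogram patch (rows = frequency bins, columns = time frames).
# All names and values here are assumptions, not the authors' code.

def integral_image(patch):
    """Build a summed-area table with an extra zero row and column."""
    rows, cols = len(patch), len(patch[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += patch[r][c]
            # Sum of everything above, plus the running sum of this row.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, r0, c0, r1, c1):
    """Sum of patch[r0:r1][c0:c1] using four table lookups."""
    return ii[r1][c1] - ii[r0][c1] - ii[r1][c0] + ii[r0][c0]


def haar_temporal_edge(ii, r0, c0, height, width):
    """Earlier half minus later half of a box, split along the time axis."""
    half = width // 2
    earlier = box_sum(ii, r0, c0, r0 + height, c0 + half)
    later = box_sum(ii, r0, c0 + half, r0 + height, c0 + 2 * half)
    return earlier - later


# Toy patch: 4 frequency bins x 6 frames, with energy switching on at frame 3.
patch = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]
ii = integral_image(patch)
total = box_sum(ii, 0, 0, 4, 6)
edge = haar_temporal_edge(ii, 0, 0, 4, 6)
print(total, edge)
```

A strongly negative response here indicates an energy onset inside the patch. In a boosting setup, many such features at different positions, scales, and orientations would typically serve as candidate weak classifiers.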