14.1 LG May 11
Voice Biomarkers for Depression and Anxiety
Oleksii Abramenko, Noah D. Stein, Colin Vaz
Current approaches to detecting depression and anxiety from speech primarily rely on machine learning techniques that use hand-engineered paralinguistic features and related acoustic descriptors derived from time- and frequency-domain representations of speech signals. Applying deep learning methods directly to raw speech signals has the potential to produce biomarker representations with substantially greater predictive power. However, these approaches typically require large volumes of carefully annotated data to learn robust and clinically meaningful representations of the underlying biomarkers. In this paper, we describe our efforts toward developing a deep learning model trained on a large-scale proprietary dataset comprising ~65,000 utterances collected from more than 23,000 subjects representative of relevant United States demographics. We present the techniques employed and analyze their impact on model performance. Our results demonstrate that the proposed models can extract content-agnostic biomarker information, which, when combined with lexical features extracted from audio, yields improved predictive performance in production settings. Our models are evaluated on ~5,000 unique subjects and achieve 71% sensitivity and specificity. To foster further research in mental health assessment from speech, we release the best-performing model described in this paper on HuggingFace.
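For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how sensitivity and specificity are computed from binary labels. The function name and the toy labels are illustrative assumptions and are not taken from the paper or its dataset.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were rejected
    return sensitivity, specificity


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here sensitivity is the fraction of positive cases the model flags, and specificity is the fraction of negative cases it correctly leaves unflagged.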
ML May 15, 2018
Graph Signal Sampling via Reinforcement Learning
Oleksii Abramenko, Alexander Jung
We formulate the problem of sampling and recovering clustered graph signals as a multi-armed bandit (MAB) problem. This formulation lends itself naturally to learning sampling strategies using the well-known gradient MAB algorithm. In particular, the sampling strategy is represented as a probability distribution over the individual arms of the MAB and optimized using gradient ascent. Some illustrative numerical experiments indicate that the sampling strategies based on the gradient MAB algorithm outperform existing sampling methods.
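The gradient MAB idea mentioned above can be sketched as follows. This is a generic textbook-style gradient bandit, not the authors' implementation: the policy is a softmax distribution over per-arm preferences, updated by stochastic gradient ascent on expected reward. The reward function `toy_reward`, its arm means, and all hyperparameters are hypothetical; in the paper's setting the arms and reward would be tied to graph signal sampling and recovery, which is not modeled here.

```python
import math
import random


def softmax(prefs):
    """Turn arm preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_bandit(reward_fn, n_arms, steps=5000, step_size=0.1, seed=0):
    """Learn a sampling distribution over arms by stochastic gradient ascent."""
    rng = random.Random(seed)
    prefs = [0.0] * n_arms  # one preference per arm; policy = softmax(prefs)
    baseline = 0.0          # running average reward, used to reduce variance
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(arm, rng)
        baseline += (reward - baseline) / t
        for a in range(n_arms):
            indicator = 1.0 if a == arm else 0.0
            # Stochastic gradient of expected reward w.r.t. preference a.
            prefs[a] += step_size * (reward - baseline) * (indicator - probs[a])
    return softmax(prefs)


def toy_reward(arm, rng):
    """Hypothetical noisy reward; arm 2 has the highest mean."""
    means = [0.2, 0.5, 0.9]
    return means[arm] + rng.gauss(0.0, 0.1)


policy = gradient_bandit(toy_reward, n_arms=3)
```

The returned `policy` is the learned probability distribution over arms; under this toy reward it should concentrate on the arm with the highest mean.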
LG Jan 17, 2017
On the Sample Complexity of Graphical Model Selection for Non-Stationary Processes
Nguyen Q. Tran, Oleksii Abramenko, Alexander Jung
We characterize the sample size required for accurate graphical model selection from non-stationary samples. The observed data is modeled as a vector-valued zero-mean Gaussian random process whose samples are uncorrelated but have different covariance matrices. This model contains as special cases the standard setting of i.i.d. samples as well as the case of samples forming a stationary or underspread (non-stationary) process. More generally, our model applies to any process model for which an efficient decorrelation can be obtained. By analyzing a particular model selection method, we derive a sufficient condition on the required sample size for accurate graphical model selection based on non-stationary data.