AS · CL · LG · ML · May 19, 2020

Bayesian Subspace HMM for the Zerospeech 2020 Challenge

arXiv:2005.09282v2 · 1 citation
AI Analysis

This work addresses the challenge of unsupervised speech representation learning for synthesis, but it is incremental as it builds on existing HMM methods with subspace constraints.

For the Zerospeech 2020 challenge, the authors address the discovery of latent representations from unannotated speech for use in synthesis. They use a Bayesian Subspace HMM for unit discovery; the system compares favorably with the baseline on human-evaluated character error rate while maintaining a significantly lower unit bitrate.

In this paper we describe our submission to the Zerospeech 2020 challenge, where the participants are required to discover latent representations from unannotated speech and to use those representations to perform speech synthesis, with synthesis quality used as a proxy metric for the unit quality. In our system, we use the Bayesian Subspace Hidden Markov Model (SHMM) for unit discovery. The SHMM models each unit as an HMM whose parameters are constrained to lie in a low-dimensional subspace of the total parameter space; this subspace is trained to model phonetic variability. Our system compares favorably with the baseline on the human-evaluated character error rate while maintaining a significantly lower unit bitrate.
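The subspace constraint described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each unit's full parameter vector is an affine function of a low-dimensional embedding (W h + b). The abstract only states that the parameters lie in a low-dimensional subspace, so the affine form, the dimensions, and all names below (`unit_parameters`, `W`, `b`, `H`) are hypothetical. The Bayesian treatment and the HMM itself are omitted.

```python
import numpy as np

def unit_parameters(W, b, h):
    """Map a low-dimensional unit embedding h to a full parameter vector.

    Assumed affine form (not taken from the source): params = W @ h + b.
    """
    return W @ h + b

rng = np.random.default_rng(0)

param_dim = 40     # hypothetical size of one unit's full HMM parameter vector
subspace_dim = 5   # hypothetical dimension of the shared subspace
n_units = 20       # hypothetical number of discovered units

W = rng.normal(size=(param_dim, subspace_dim))  # shared subspace basis
b = rng.normal(size=param_dim)                  # shared offset
H = rng.normal(size=(n_units, subspace_dim))    # one embedding per unit

# Full parameter vectors for every unit, shape (n_units, param_dim).
params = np.stack([unit_parameters(W, b, h) for h in H])

# After removing the shared offset, all units lie in the column space of W,
# so the centered parameter matrix has rank at most subspace_dim.
rank = np.linalg.matrix_rank(params - b)
print(params.shape, rank)
```

In this sketch, each unit is described by only `subspace_dim` free numbers rather than `param_dim`, which is the sense in which the parameters are constrained to a subspace.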

