LG · SD · AS · ML · Feb 2, 2018

A Generative Model for Natural Sounds Based on Latent Force Modelling

arXiv:1802.00680v2 · 2 citations
AI Analysis

This work addresses the generation of realistic natural sounds for audio synthesis applications. It is an incremental improvement that integrates prior physical knowledge into existing probabilistic methods.

The authors model natural sounds by building physical knowledge of amplitude envelope behaviour into a generative model. Sounds sampled from this model were perceived as more realistic than those from comparable nonnegative matrix factorisation models, even in cases where its reconstruction error was higher.

Recent advances in analysis of subband amplitude envelopes of natural sounds have resulted in convincing synthesis, showing subband amplitudes to be a crucial component of perception. Probabilistic latent variable analysis is particularly revealing, but existing approaches don't incorporate prior knowledge about the physical behaviour of amplitude envelopes, such as exponential decay and feedback. We use latent force modelling, a probabilistic learning paradigm that incorporates physical knowledge into Gaussian process regression, to model correlation across spectral subband envelopes. We augment the standard latent force model approach by explicitly modelling correlations over multiple time steps. Incorporating this prior knowledge strengthens the interpretation of the latent functions as the source that generated the signal. We examine this interpretation via an experiment which shows that sounds generated by sampling from our probabilistic model are perceived to be more realistic than those generated by similar models based on nonnegative matrix factorisation, even in cases where our model is outperformed from a reconstruction error perspective.
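The idea of driving several subband envelopes from one shared latent function can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a first-order latent force model, dx_m/dt = -D_m x_m + S_m g(u), in which a single Gaussian process sample u(t) is passed through a softplus g to keep it nonnegative and then drives each envelope x_m, with exponential decay at rate D_m. The kernel choice, the softplus, the forward-Euler integrator, and every name and parameter value (`sample_gp`, `simulate_envelopes`, `decay`, `sensitivity`) are assumptions made for illustration. The feedback and multi-time-step correlation terms described in the abstract are not modelled here.

```python
import numpy as np

def sample_gp(t, lengthscale, rng):
    """Draw one zero-mean GP sample with a squared-exponential kernel."""
    diff = t[:, None] - t[None, :]
    K = np.exp(-0.5 * (diff / lengthscale) ** 2)
    K += 1e-6 * np.eye(len(t))  # jitter for numerical stability
    return np.linalg.cholesky(K) @ rng.standard_normal(len(t))

def simulate_envelopes(t, u, decay, sensitivity):
    """Forward-Euler integration of dx_m/dt = -decay_m * x_m + sensitivity_m * g(u)."""
    dt = t[1] - t[0]
    force = np.log1p(np.exp(u))  # softplus (assumed) keeps the driving force positive
    x = np.zeros((len(t), len(decay)))
    for k in range(1, len(t)):
        dx = -decay * x[k - 1] + sensitivity * force[k - 1]
        x[k] = x[k - 1] + dt * dx
    return x

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)               # one second, 200 time steps
u = sample_gp(t, lengthscale=0.1, rng=rng)   # shared latent force
decay = np.array([5.0, 20.0, 50.0])          # hypothetical per-subband decay rates
sensitivity = np.array([1.0, 0.5, 2.0])      # hypothetical per-subband sensitivities
envelopes = simulate_envelopes(t, u, decay, sensitivity)  # shape (200, 3)
```

Because every subband is driven by the same latent function, the simulated envelopes are correlated across frequency while each decays at its own rate, which is the kind of structure the abstract attributes to the physical prior.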

