ML LG | Dec 20, 2013

A Generative Product-of-Filters Model of Audio

Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore

arXiv:1312.5857v5 | 1 citation

Originality: Incremental advance

AI Analysis

This work replaces the hand-designed decompositions of classic homomorphic filtering with a decomposition learned by statistical inference. It shows potential for tasks such as bandwidth expansion and speaker identification, though the advance appears incremental relative to homomorphic filtering.

The authors address audio spectral decomposition with the product-of-filters (PoF) model, a generative approach that represents audio spectra as sparse linear combinations of learned filters in the log-spectral domain. They show its potential on a bandwidth expansion task and use it as an unsupervised feature extractor for speaker identification.

We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain. PoF makes similar assumptions to those used in the classic homomorphic filtering approach to signal processing, but replaces hand-designed decompositions built of basic signal processing operations with a learned decomposition based on statistical inference. This paper formulates the PoF model and derives a mean-field method for posterior inference and a variational EM algorithm to estimate the model's free parameters. We demonstrate PoF's potential for audio processing on a bandwidth expansion task, and show that PoF can serve as an effective unsupervised feature extractor for a speaker identification task.
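The name "product of filters" follows from the log-domain formulation: a sum of filters in the log-spectral domain is a product of filters in the linear spectral domain. The sketch below only illustrates that identity with NumPy. The dictionary `U`, the activation vector `a`, and the sizes are hypothetical placeholders, and the random values stand in for quantities the paper learns by variational EM; this is not the authors' inference procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_freq, n_filters = 8, 5

# Hypothetical filter dictionary: each column is one filter
# expressed in the log-spectral domain.
U = rng.normal(size=(n_freq, n_filters))

# Sparse activations for a single frame: only two filters are active.
a = np.zeros(n_filters)
a[[1, 3]] = [0.7, 1.5]

# Additive combination of filters in the log-spectral domain.
log_spectrum = U @ a

# Equivalent multiplicative form in the linear spectral domain:
# each filter contributes a factor exp(a_l * U[:, l]).
spectrum = np.prod(np.exp(U * a), axis=1)

# The two forms agree up to floating-point error.
assert np.allclose(np.exp(log_spectrum), spectrum)
```

Inactive filters have zero activation and therefore contribute a factor of 1, which is how sparsity in the log domain translates to the linear domain.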
