LG · SD · ML · Jun 27, 2012

Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation

arXiv:1206.6468v1 · 27 citations
Originality: Incremental advance
AI Analysis

This work removes a computational bottleneck in N-FHMM-based audio source separation, making the approach more practical for real-world use. The advance is incremental, since it builds on an existing model.

The paper tackles the exponential complexity of inference in non-negative factorial hidden Markov models (N-FHMMs) for audio source separation. It introduces a Bayesian variant of the model together with a variational inference algorithm whose complexity is linear in the number of sources. The method performs comparably to exact inference in the original N-FHMM and is around 30x faster in typical configurations.

The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although able to capture audio spectral structure well, these models neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduces a temporal dimension and improves source separation performance. However, the factorial nature of this model makes the complexity of inference exponential in the number of sound sources. Here, we present a Bayesian variant of the N-FHMM suited to an efficient variational inference algorithm, whose complexity is linear in the number of sound sources. Our algorithm performs comparably to exact inference in the original N-FHMM but is significantly faster. In typical configurations of the N-FHMM, our method achieves around a 30x increase in speed.
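
The contrast between exponential and linear scaling in the number of sources can be made concrete with a small counting sketch. It is an illustration only, not the paper's algorithm or its cost model: it assumes every source has the same number of hidden states, and the function names and the value of 20 states per source are hypothetical. Exact inference in a factorial model must consider every joint combination of per-source states, while a fully factorized approximation handles each source's states separately.

```python
def joint_state_count(num_sources: int, states_per_source: int) -> int:
    """Joint hidden-state combinations per time frame in a factorial model.

    Assumes (for illustration) that every source has the same number of states.
    """
    return states_per_source ** num_sources


def factorized_state_count(num_sources: int, states_per_source: int) -> int:
    """Per-source states visited when each source is treated separately.

    This stands in for a fully factorized approximation; it is a counting
    argument, not the cost of the paper's variational algorithm.
    """
    return num_sources * states_per_source


if __name__ == "__main__":
    states = 20  # hypothetical number of states per source
    for sources in (1, 2, 3, 4):
        exact = joint_state_count(sources, states)
        factorized = factorized_state_count(sources, states)
        print(f"sources={sources}  joint={exact}  factorized={factorized}")
```

Under these assumptions the joint count is multiplied by the number of states with every added source, whereas the factorized count only grows by that number. The roughly 30x speedup reported in the abstract is an empirical figure and is not derived from this count.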

