SD · Dec 15, 2016

A Phase Vocoder based on Nonstationary Gabor Frames

arXiv:1612.05156v2 · 14 citations
Originality: Incremental advance
AI Analysis

The paper is an incremental advance in audio signal processing, specifically music time stretching: it enhances the classical phase vocoder with adaptive techniques.

The paper addresses time stretching of music signals with a new phase vocoder algorithm based on nonstationary Gabor frames. The algorithm reduces artifacts such as phasiness and transient smearing while using only three times as many time-frequency coefficients as signal samples.
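The "three times as many coefficients as samples" figure is a redundancy: the total number of time-frequency coefficients divided by the signal length. The sketch below only illustrates that ratio. The function name `redundancy`, the signal length, and both window layouts are hypothetical values chosen for illustration, not the frames used in the paper, and every channel is counted as one complex coefficient.

```python
def redundancy(channels_per_window, signal_length):
    """Ratio of TF coefficients to signal samples.

    channels_per_window[n] is the number of frequency channels
    attached to the n-th analysis window.
    """
    return sum(channels_per_window) / signal_length


L = 65536  # hypothetical signal length in samples

# Uniform layout: one window every 256 samples, 1024 channels each.
uniform = [1024] * (L // 256)

# Hypothetical nonuniform layout: 24 long windows (4096 channels,
# spaced 2048 samples apart) for sustained parts of the signal and
# 128 short windows (256 channels, spaced 128 samples apart)
# around transients. Together they cover all 65536 samples.
nonuniform = [4096] * 24 + [256] * 128

if __name__ == "__main__":
    print("uniform:", redundancy(uniform, L))
    print("nonuniform:", redundancy(nonuniform, L))
```

In a nonuniform layout the number of channels can follow the window length, so short windows around transients contribute few coefficients. That is the mechanism by which an adaptive representation can stay at low redundancy.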

We propose a new algorithm for time stretching music signals based on the theory of nonstationary Gabor frames (NSGFs). The algorithm extends the techniques of the classical phase vocoder (PV) by incorporating adaptive time-frequency (TF) representations and adaptive phase locking. The adaptive TF representations provide good time resolution for the onsets of attack transients and good frequency resolution for the sinusoidal components. We estimate the phase values only at peak channels, and the remaining phases are then locked to the values of the peaks in an adaptive manner. During attack transients, we keep the stretch factor equal to one, and we propose a new strategy for determining which channels are relevant for reinitializing the corresponding phase values. In contrast to previously published algorithms, we use a non-uniform NSGF to obtain a low redundancy of the corresponding TF representation. We show that with just three times as many TF coefficients as signal samples, artifacts such as phasiness and transient smearing can be greatly reduced compared to the classical PV. The proposed algorithm is tested on both synthetic and real-world signals and compared with state-of-the-art algorithms in a reproducible manner.
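For orientation, the classical PV that the abstract takes as its baseline can be sketched in a few lines of NumPy. This is a minimal sketch of the textbook method under stated assumptions, not the proposed NSGF algorithm: it uses a single fixed Hann window, estimates phase in every channel (no peak picking or phase locking), and does nothing special at transients. The names `stft` and `phase_vocoder` and the defaults `n_fft=1024` and `hop_a=256` are illustrative choices.

```python
import numpy as np


def stft(x, win, hop):
    """Windowed frames of x, transformed with a real FFT."""
    n = len(win)
    n_frames = 1 + (len(x) - n) // hop
    frames = np.stack([x[i * hop:i * hop + n] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def phase_vocoder(x, stretch, n_fft=1024, hop_a=256):
    """Classical PV: analyse with hop_a, resynthesise with hop_s = stretch * hop_a.

    Assumes len(x) >= n_fft and a stretch factor small enough that the
    synthesis hop stays below n_fft (so frames still overlap).
    """
    win = np.hanning(n_fft)
    hop_s = int(round(stretch * hop_a))
    X = stft(x, win, hop_a)
    mag, phase = np.abs(X), np.angle(X)

    # Centre frequency of each channel in radians per sample.
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft

    syn_phase = np.empty_like(phase)
    syn_phase[0] = phase[0]
    for t in range(1, len(X)):
        # Deviation from the expected phase advance, wrapped to [-pi, pi].
        dphi = phase[t] - phase[t - 1] - omega * hop_a
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        inst_freq = omega + dphi / hop_a
        # Advance the synthesis phase over the (longer or shorter) synthesis hop.
        syn_phase[t] = syn_phase[t - 1] + inst_freq * hop_s

    # Weighted overlap-add with the synthesis hop.
    frames = np.fft.irfft(mag * np.exp(1j * syn_phase), n=n_fft, axis=1) * win
    out = np.zeros(hop_s * (len(X) - 1) + n_fft)
    norm = np.zeros_like(out)
    for t, frame in enumerate(frames):
        out[t * hop_s:t * hop_s + n_fft] += frame
        norm[t * hop_s:t * hop_s + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-3)


if __name__ == "__main__":
    sr = 16000
    tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
    stretched = phase_vocoder(tone, 1.5)
    print(len(tone), len(stretched))
```

Because every channel's phase is propagated independently and the same window is used everywhere, a sketch like this exhibits the phasiness and transient smearing that the abstract names as the artifacts to reduce.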

