Learnable Adaptive Time-Frequency Representation via Differentiable Short-Time Fourier Transform
This work addresses the parameter sensitivity of the STFT in non-stationary signal analysis by offering a learnable alternative that integrates with neural networks. The contribution is incremental in that it builds on existing STFT methods.
The paper tackles the suboptimal performance of the short-time Fourier transform (STFT) under manual parameter tuning. It proposes a differentiable STFT whose parameters can be optimized by gradient descent, and it reports improved time-frequency representations and downstream task performance on simulated and real-world data.
The short-time Fourier transform (STFT) is widely used for analyzing non-stationary signals. However, its performance is highly sensitive to its parameters, and manual or heuristic tuning often yields suboptimal results. To overcome this limitation, we propose a unified differentiable formulation of the STFT that enables gradient-based optimization of its parameters. Unlike traditional STFT parameter tuning methods, which often rely on computationally intensive discrete searches, this approach enables fine-tuning of the time-frequency representation (TFR) based on any desired criterion. Moreover, it integrates seamlessly with neural networks, allowing joint optimization of the STFT parameters and network weights. Experiments on both simulated and real-world data demonstrate the efficacy of the proposed differentiable STFT in enhancing TFRs and improving performance in downstream tasks.
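To make the idea of gradient-based tuning of an STFT parameter concrete, the following is a minimal NumPy sketch and not the paper's formulation. It rests on several assumptions that do not come from the abstract: the only tuned parameter is the width `sigma` of a Gaussian window on a fixed 128-sample support, the hop size is fixed, and the criterion is a simple energy-concentration measure. All function names (`make_chirp`, `stft_and_dsigma`, `concentration_loss_and_grad`, `tune_sigma`) are hypothetical. Because the STFT is linear in the window, its derivative with respect to `sigma` is just the STFT computed with the differentiated window, which gives an analytic gradient of the criterion.

```python
import numpy as np

def make_chirp(n=1024, f0=0.05, f1=0.25):
    """Linear chirp in normalized frequency (cycles/sample); a toy test signal."""
    t = np.arange(n)
    k = (f1 - f0) / n
    return np.cos(2 * np.pi * (f0 * t + 0.5 * k * t ** 2))

def stft_and_dsigma(x, sigma, support=128, hop=32):
    """Return the STFT S and its derivative dS/dsigma for a Gaussian window.

    Assumption: the window lives on a fixed support and only its continuous
    width sigma (in samples) is tunable.
    """
    n = np.arange(support) - (support - 1) / 2.0
    w = np.exp(-0.5 * (n / sigma) ** 2)      # Gaussian window
    dw = w * n ** 2 / sigma ** 3             # dw/dsigma, computed analytically
    starts = np.arange(0, len(x) - support + 1, hop)
    frames = x[starts[:, None] + np.arange(support)[None, :]]
    S = np.fft.rfft(frames * w, axis=1)
    dS = np.fft.rfft(frames * dw, axis=1)    # the STFT is linear in the window
    return S, dS

def concentration_loss_and_grad(x, sigma):
    """Loss = -sum(P^2) / sum(P)^2 with P = |S|^2 (lower means more concentrated)."""
    S, dS = stft_and_dsigma(x, sigma)
    P = np.abs(S) ** 2
    dP = 2.0 * np.real(np.conj(S) * dS)      # chain rule through |S|^2
    A, B = np.sum(P ** 2), np.sum(P)
    dA, dB = np.sum(2.0 * P * dP), np.sum(dP)
    loss = -A / B ** 2
    grad = -(dA / B ** 2 - 2.0 * A * dB / B ** 3)
    return loss, grad

def tune_sigma(x, sigma0, steps=50, step=0.25, lo=2.0, hi=40.0):
    """Sign-gradient descent on sigma with a fixed step, clipped to [lo, hi]."""
    sigma = float(sigma0)
    for _ in range(steps):
        _, g = concentration_loss_and_grad(x, sigma)
        sigma = float(np.clip(sigma - step * np.sign(g), lo, hi))
    return sigma

if __name__ == "__main__":
    x = make_chirp()
    print("initial sigma: 12.0, tuned sigma:", tune_sigma(x, 12.0))
```

The sign-based update is used here only to keep the step size independent of the gradient's scale. In a neural-network setting, the same gradient could instead be passed to whichever first-order optimizer updates the network weights, which is the kind of joint optimization the abstract describes.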