A differentiable short-time Fourier transform with respect to the window length
This work addresses hyperparameter tuning in STFT-based signal processing, a concern for both researchers and practitioners. It is incremental in that it builds on existing STFT methods.
The paper tackles the problem of optimizing the window length parameter in spectrograms used by neural networks by making it a continuous, differentiable parameter instead of an empirically tuned integer. The result is a theoretical framework for a differentiable STFT that can be plugged into existing networks, with potential benefits shown on estimation and classification tasks.
In this paper, we revisit the use of spectrograms in neural networks by making the window length a continuous parameter optimizable by gradient descent, instead of an empirically tuned integer-valued hyperparameter. The contribution is mostly theoretical at this point, but plugging the modified STFT into any existing neural network is straightforward. We first define a differentiable version of the STFT in the case where local bin centers are fixed and independent of the window length parameter. We then discuss the more difficult case where the window length affects the position and number of bins. We illustrate the benefits of this new tool on an estimation problem and a classification problem, showing it can be of interest not only to neural networks but to any STFT-based signal processing algorithm.
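To make the fixed-bin-center case concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the Hann window with a continuous length `theta`, the fixed frame support `support`, and all function names are hypothetical choices made for this example. The sketch computes one STFT coefficient and the derivative of its energy with respect to `theta`.

```python
import cmath
import math


def hann(u, theta):
    """Hann window of continuous length theta, at offset u from the frame center.

    Assumption for this sketch: the window is zero outside |u| < theta / 2.
    """
    if abs(u) >= theta / 2:
        return 0.0
    return 0.5 + 0.5 * math.cos(2 * math.pi * u / theta)


def dhann_dtheta(u, theta):
    """Analytic derivative of hann(u, theta) with respect to theta."""
    if abs(u) >= theta / 2:
        return 0.0
    return math.pi * u * math.sin(2 * math.pi * u / theta) / theta ** 2


def stft_bin(x, center, freq, theta, support):
    """Return (S, dS/dtheta) for one time-frequency bin.

    The frame center and the support are fixed, so only the window
    shape depends on theta (the simpler case in the abstract).
    """
    s = 0j
    ds = 0j
    half = support // 2
    for k in range(center - half, center + half + 1):
        if 0 <= k < len(x):
            u = k - center
            phase = cmath.exp(-2j * math.pi * freq * u / support)
            s += x[k] * hann(u, theta) * phase
            ds += x[k] * dhann_dtheta(u, theta) * phase
    return s, ds


def energy_and_grad(x, center, freq, theta, support):
    """Spectrogram energy |S|^2 and its gradient with respect to theta."""
    s, ds = stft_bin(x, center, freq, theta, support)
    energy = abs(s) ** 2
    grad = 2.0 * (s.conjugate() * ds).real  # d|S|^2/dtheta = 2 Re(conj(S) dS/dtheta)
    return energy, grad


# Toy signal: a sinusoid aligned with frequency bin 5 of a 64-sample support.
signal = [math.sin(2 * math.pi * 5 * k / 64) for k in range(128)]
energy, grad = energy_and_grad(signal, center=50, freq=5, theta=20.5, support=64)
print("energy:", energy, "d(energy)/d(theta):", grad)
```

Because `theta` enters only through the window values, the gradient can drive a standard gradient-descent update on the window length. In an automatic-differentiation framework the hand-written derivative would not be needed.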