ASSD Oct 4, 2021

WaveBeat: End-to-end beat and downbeat tracking in the time domain

arXiv:2110.01436v1, 13 citations
Originality: Highly original
AI Analysis

This paper addresses beat tracking for music analysis by introducing a novel time-domain approach, though its improvement over existing deep learning methods is incremental.

The authors tackled beat and downbeat tracking in audio by proposing WaveBeat, an end-to-end method that operates directly on waveforms instead of relying on hand-crafted spectral features, achieving state-of-the-art results on some datasets and comparable performance on others.

Deep learning approaches for beat and downbeat tracking have brought advancements. However, these approaches continue to rely on hand-crafted, subsampled spectral features as input, restricting the information available to the model. In this work, we propose WaveBeat, an end-to-end approach for joint beat and downbeat tracking operating directly on waveforms. This method forgoes engineered spectral features and instead produces beat and downbeat predictions directly from the waveform, the first of its kind for this task. Our model utilizes temporal convolutional networks (TCNs) operating on waveforms that achieve a very large receptive field ($\geq$ 30 s) at audio sample rates in a memory-efficient manner by employing rapidly growing dilation factors with fewer layers. With a straightforward data augmentation strategy, our method outperforms previous state-of-the-art methods on some datasets, while producing comparable results on others, demonstrating the potential for time-domain approaches.
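To make the receptive-field claim concrete, the following is a minimal sketch of how rapidly growing dilation factors reduce the number of layers needed to cover 30 s of audio. It uses the standard receptive-field formula for stacked stride-1 dilated 1-D convolutions. The sample rate (22050 Hz), kernel size (15), and growth factors (2 and 8) are illustrative assumptions and are not taken from the abstract; the helper names are hypothetical and this is not the authors' implementation.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 dilated 1-D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def layers_needed(kernel_size, growth, target_samples):
    """Smallest layer count whose dilations growth**0, growth**1, ... cover target_samples."""
    n = 0
    while True:
        n += 1
        dilations = [growth ** i for i in range(n)]
        if receptive_field(kernel_size, dilations) >= target_samples:
            return n


SAMPLE_RATE = 22050          # Hz; assumed for illustration
KERNEL_SIZE = 15             # assumed for illustration
TARGET = 30 * SAMPLE_RATE    # 30 s expressed in samples

for growth in (2, 8):        # slow vs. rapid dilation growth (assumed values)
    n = layers_needed(KERNEL_SIZE, growth, TARGET)
    rf = receptive_field(KERNEL_SIZE, [growth ** i for i in range(n)])
    print(f"growth={growth}: {n} layers, receptive field {rf / SAMPLE_RATE:.1f} s")
```

Under these assumed settings, doubling the dilation at each layer needs 16 layers to reach 30 s, while a growth factor of 8 needs 7, which illustrates why faster dilation growth allows a shallower network.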

Code Implementations: 1 repo