Translation-Equivariant Self-Supervised Learning for Pitch Estimation with Optimal Transport
This work addresses single pitch estimation in audio processing, offering an incremental improvement: a training objective that is more numerically stable and simpler than those of existing self-supervised methods.
The paper tackles single pitch estimation by proposing an Optimal Transport objective for learning translation-equivariant systems, yielding a theoretically grounded, more numerically stable, and simpler alternative to existing self-supervised methods.
In this paper, we propose an Optimal Transport objective for learning one-dimensional translation-equivariant systems and demonstrate its applicability to single pitch estimation. Our method provides a theoretically grounded, more numerically stable, and simpler alternative for training state-of-the-art self-supervised pitch estimators.
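
To illustrate why Optimal Transport fits one-dimensional translation equivariance, here is a minimal NumPy sketch. It is not the paper's exact objective. It assumes the model outputs a histogram over equally spaced pitch bins and that a pitch shift of the input by `k` bins should shift the output histogram by `k` bins. The function names `wasserstein_1d`, `shift_bins`, and `equivariance_loss` are hypothetical. In one dimension the Wasserstein-1 distance has a closed form, the L1 distance between the two cumulative distributions, so no iterative solver is needed.

```python
import numpy as np


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two histograms on the same unit-spaced grid.

    Both inputs are normalized to sum to 1; the distance is the L1 norm of the
    difference between their cumulative sums. Inputs must have nonzero mass.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def shift_bins(p, k):
    """Shift a histogram by k bins (zero-padded, not circular).

    Mass pushed past either edge is dropped, so k should stay small
    relative to the number of bins.
    """
    p = np.asarray(p, dtype=float)
    out = np.zeros_like(p)
    if k >= 0:
        out[k:] = p[: len(p) - k]
    else:
        out[:k] = p[-k:]
    return out


def equivariance_loss(p_original, p_transposed, k):
    """Hypothetical equivariance penalty (an assumption, not the paper's loss).

    p_original:   model output for the original audio
    p_transposed: model output for the audio pitch-shifted by k bins
    The penalty is zero when p_transposed equals p_original shifted by k bins.
    """
    return wasserstein_1d(shift_bins(p_original, k), p_transposed)


# Toy example: two one-hot pitch distributions over 10 bins.
p = np.zeros(10)
p[3] = 1.0
q = np.zeros(10)
q[5] = 1.0

distance = wasserstein_1d(p, q)           # mass moves 2 bins
loss_correct = equivariance_loss(p, q, 2)  # the shift matches the transposition
loss_wrong = equivariance_loss(p, q, 0)    # the shift is ignored
```

In this toy case the distance grows linearly with the number of bins separating the two peaks, which gives a useful training signal even when the distributions do not overlap. A bin-wise loss such as cross-entropy would treat all non-overlapping predictions as equally wrong.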