SD · LG · AS · Apr 25, 2024

COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations

arXiv:2404.16969v4 · 22 citations · h-index: 20 · ICASSP
Originality: Incremental advance
AI Analysis

COCOLA gives researchers and developers working on music generation an objective way to evaluate accompaniment models, though the advance is incremental: it builds on existing contrastive learning and audio processing techniques.

The paper tackles the problem of evaluating generative models for music accompaniment generation by proposing COCOLA, a contrastive learning method that captures harmonic and rhythmic coherence in musical audio representations, and demonstrates its effectiveness in benchmarking recent models.

We present COCOLA (Coherence-Oriented Contrastive Learning for Audio), a contrastive learning method for musical audio representations that captures the harmonic and rhythmic coherence between samples. Our method operates at the level of the stems composing music tracks and can take as input features obtained via Harmonic-Percussive Separation (HPS). COCOLA allows the objective evaluation of generative models for music accompaniment generation, which are difficult to benchmark with established metrics. In this regard, we evaluate recent music accompaniment generation models, demonstrating the effectiveness of the proposed method. We release the model checkpoints trained on public datasets containing separate stems (MUSDB18-HQ, MoisesDB, Slakh2100, and CocoChorales).
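The contrastive idea in the abstract can be illustrated with a small sketch. It assumes a generic InfoNCE-style loss over cosine similarities, which is not necessarily the objective used in the paper, and it uses hand-written toy vectors in place of real stem embeddings. The names `cosine`, `info_nce`, and `temperature` are hypothetical and chosen only for illustration. Each anchor is paired with one coherent sample (for example, other stems from the same track and time window), and every other sample in the batch acts as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Small epsilon guards against division by zero for all-zero vectors.
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss (an assumed, generic formulation).

    positives[i] is the coherent partner of anchors[i]; every other
    positives[j] in the batch is treated as a negative for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy 2-D "embeddings" standing in for encoded stems (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
coherent = [[0.9, 0.1], [0.1, 0.9]]    # partners aligned with their anchors
shuffled = [coherent[1], coherent[0]]  # partners swapped between tracks

loss_coherent = info_nce(anchors, coherent)
loss_shuffled = info_nce(anchors, shuffled)
print(loss_coherent < loss_shuffled)
```

In this toy setting the loss is lower when each anchor is matched with its own partner than when partners are swapped across tracks. A trained encoder could in principle be used the same way to score how well a generated accompaniment fits its conditioning audio.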

Code Implementations: 2 repos