Learning a Representation for Cover Song Identification Using Convolutional Neural Network
This paper addresses the challenge of identifying cover songs in Music Information Retrieval. Its contribution is incremental, as it builds on prior neural network approaches.
The paper tackles the problem of cover song identification by proposing a novel CNN architecture trained with classification strategies, together with a scheme for robustness against tempo changes. It reports state-of-the-art performance on all public datasets, with the most notable improvement on the large dataset.
Cover song identification is a challenging task in the field of Music Information Retrieval (MIR) because of the complex musical variations between query tracks and cover versions. Previous work typically relies on hand-crafted features and alignment algorithms. More recently, further breakthroughs have been achieved with neural network approaches. In this paper, we propose a novel Convolutional Neural Network (CNN) architecture based on the characteristics of the cover song task. We first train the network through classification strategies; the network is then used to extract music representations for cover song identification. We also design a scheme to train models that are robust against tempo changes. Experimental results show that our approach outperforms state-of-the-art methods on all public datasets, with the largest improvement on the large dataset.
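The pipeline described above (extract a fixed-size representation per track, then compare representations) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: `change_tempo` simulates a tempo change by linearly resampling the time axis of a feature matrix, `embed` is a hypothetical stand-in for the trained CNN (it only averages over time), and `rank_by_cosine` assumes cosine similarity as the comparison measure.

```python
import numpy as np


def change_tempo(feature, factor):
    """Simulate a tempo change by resampling the time axis (assumed augmentation).

    feature: array of shape (n_bins, n_frames).
    factor:  > 1 speeds the track up (fewer frames), < 1 slows it down.
    """
    n_bins, n_frames = feature.shape
    new_frames = max(1, int(round(n_frames / factor)))
    old_t = np.arange(n_frames)
    new_t = np.linspace(0, n_frames - 1, new_frames)
    # Linearly interpolate each frequency/pitch bin along time.
    return np.stack([np.interp(new_t, old_t, row) for row in feature])


def embed(feature):
    """Hypothetical stand-in for the trained CNN: average each bin over time.

    A real model would replace this; it only provides a fixed-size vector
    regardless of the number of frames.
    """
    return feature.mean(axis=1)


def rank_by_cosine(query_emb, ref_embs):
    """Return reference indices sorted from most to least similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    r = ref_embs / np.linalg.norm(ref_embs, axis=1, keepdims=True)
    sims = r @ q  # cosine similarity of each reference with the query
    return np.argsort(-sims)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.random((12, 200))        # toy feature matrix, 12 bins
    other = rng.random((12, 180))           # unrelated toy track
    cover = change_tempo(original, 1.2)     # "cover" played 20% faster
    refs = np.stack([embed(other), embed(original)])
    print(rank_by_cosine(embed(cover), refs))
```

In a training setup, a function like `change_tempo` would be applied with randomly drawn factors to each training example, so the network sees the same song at several tempi. The range of factors to use is a design choice that this sketch does not specify.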