Pop2Piano : Pop Audio-based Piano Cover Generation
This work addresses the lack of automated piano cover generation for pop music, a widely enjoyed genre. The contribution is incremental, however, because it builds on existing data-driven and Transformer methods.
The paper tackles automatic piano cover generation from pop music audio. It builds a large synchronized dataset of {Pop, Piano Cover} pairs and introduces Pop2Piano, a Transformer model that generates covers directly from the waveform, without intermediate melody or chord extraction modules, and produces plausible results.
Piano covers of pop music are enjoyed by many people. However, the task of automatically generating piano covers of pop music is still understudied. This is partly due to the lack of synchronized {Pop, Piano Cover} data pairs, which made it challenging to apply the latest data-intensive deep learning-based methods. To leverage the power of the data-driven approach, we make a large amount of paired and synchronized {Pop, Piano Cover} data using an automated pipeline. In this paper, we present Pop2Piano, a Transformer network that generates piano covers given waveforms of pop music. To the best of our knowledge, this is the first model to generate a piano cover directly from pop audio without using melody and chord extraction modules. We show that Pop2Piano, trained with our dataset, is capable of producing plausible piano covers.