End-to-end Music Remastering System Using Self-supervised and Adversarial Training
This work lowers the entry barrier to music production for artists and producers by automating remastering. The contribution is incremental, since it builds on existing self-supervised and adversarial training methods.
The paper tackles the automation of music remastering, which traditionally requires expert audio engineers. It proposes an end-to-end system that transforms input audio to match a target mastering style, and reports that the generated samples have mastering styles similar to the target.
Mastering is an essential step in music production, but it is also a challenging task that requires experienced audio engineers, who adjust the tone, space, and volume of a song. Remastering follows the same technical process, with the aim of mastering a song to suit the times. Because these tasks have high entry barriers, we aim to lower them by proposing an end-to-end music remastering system that transforms the mastering style of input audio to that of a target. The system is trained in a self-supervised manner on released pop songs. By applying a pre-trained encoder and a projection discriminator, we also expect the model to generate realistic audio that reflects the reference's mastering style. We validate our results with quantitative metrics and a subjective listening test, and show that the model generates samples whose mastering style is similar to the target's.
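As a rough illustration of the projection discriminator mentioned above, the sketch below computes a conditional logit in the commonly used form: an unconditional score on the audio features plus an inner product between those features and a projected style embedding. All names (`projection_logit`, `features`, `style_embedding`, `w_out`, `b_out`, `proj`) and the toy dimensions are assumptions made for illustration. The abstract does not describe the actual architecture, so this is not the authors' implementation.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def matvec(m: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [dot(row, v) for row in m]


def projection_logit(
    features: Sequence[float],         # hypothetical discriminator features of the audio
    style_embedding: Sequence[float],  # hypothetical embedding of the reference's style
    w_out: Sequence[float],            # weights of the unconditional output layer
    b_out: float,                      # bias of the unconditional output layer
    proj: Sequence[Sequence[float]],   # maps the style embedding to feature dimension
) -> float:
    """Return unconditional score + <proj @ style_embedding, features>."""
    unconditional = dot(w_out, features) + b_out
    conditional = dot(matvec(proj, style_embedding), features)
    return unconditional + conditional


# Toy example: 2-dimensional audio features, 3-dimensional style embedding.
logit = projection_logit(
    features=[1.0, 2.0],
    style_embedding=[0.5, -1.0, 2.0],
    w_out=[0.1, 0.2],
    b_out=0.3,
    proj=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
```

In this form, the conditional term rewards audio whose features align with the reference style, while the unconditional term judges realism independently of the reference.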