AI · CL · MM · May 12, 2023

Unsupervised Melody-Guided Lyrics Generation

arXiv:2305.07760v2
Originality: Incremental advance
AI Analysis

The paper addresses cross-modal correlation between melody and lyrics in automatic songwriting, with users of limited music background in mind. The advance is incremental, since it builds on prior lyric generation methods.

The paper tackles the problem of generating singable lyrics without melody-lyric aligned training data. It proposes a hierarchical framework that disentangles text-only training from melody-guided inference, and reports lyrics that are more singable, intelligible, coherent, and better rhymed than strong baselines, including those supervised on parallel data.

Automatic song writing is a topic of significant practical interest. However, its research is largely hindered by the lack of training data due to copyright concerns and challenged by its creative nature. Most noticeably, prior works often fall short of modeling the cross-modal correlation between melody and lyrics due to limited parallel data, hence generating lyrics that are less singable. Existing works also lack effective mechanisms for content control, a much desired feature for democratizing song creation for people with limited music background. In this work, we propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data. Instead, we design a hierarchical lyric generation framework that disentangles training (based purely on text) from inference (melody-guided text generation). At inference time, we leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process. Evaluation results show that our model can generate high-quality lyrics that are more singable, intelligible, coherent, and in rhyme than strong baselines including those supervised on parallel data.
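
To make the idea of "compiling a melody into constraints" concrete, here is a minimal sketch. It is not the authors' implementation: the rule that longer-than-average notes call for stressed syllables, the one-syllable-per-note assumption, the toy lexicon `TOY_STRESS`, and all function names are hypothetical choices made for illustration. It also only reranks finished candidate lines, whereas the abstract describes guiding the generation process itself.

```python
from statistics import mean

# Hypothetical toy lexicon: word -> stress digits, one per syllable
# (1 = stressed, 0 = unstressed). A real system would need a full
# pronunciation dictionary; this table is an assumption for the sketch.
TOY_STRESS = {
    "the": "0", "in": "0", "my": "0",
    "river": "10", "morning": "10", "alone": "01", "tonight": "01",
    "sing": "1", "rain": "1", "heart": "1", "falls": "1",
}


def compile_melody(durations):
    """Turn note durations (in beats) into a per-note stress constraint.

    Assumptions: one syllable per note, and notes longer than the mean
    duration should carry a stressed syllable.
    """
    threshold = mean(durations)
    return [1 if d > threshold else 0 for d in durations]


def stress_pattern(line):
    """Concatenate per-word stress digits; None if a word is unknown."""
    digits = []
    for word in line.lower().split():
        if word not in TOY_STRESS:
            return None
        digits.extend(int(c) for c in TOY_STRESS[word])
    return digits


def score(line, constraint):
    """Count positions where syllable stress matches the constraint.

    Returns None when the syllable count differs from the note count.
    """
    pattern = stress_pattern(line)
    if pattern is None or len(pattern) != len(constraint):
        return None
    return sum(p == c for p, c in zip(pattern, constraint))


def best_line(candidates, durations):
    """Pick the candidate line that best fits the melody constraint."""
    constraint = compile_melody(durations)
    scored = [(score(c, constraint), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s is not None]
    return max(scored)[1] if scored else None


if __name__ == "__main__":
    melody = [0.5, 1.0, 0.5, 1.0, 0.5, 2.0]  # short-long alternation
    lines = [
        "the rain falls in my heart",
        "my heart alone tonight",
        "river morning river",
        "sing the rain",  # wrong syllable count, filtered out
    ]
    print(best_line(lines, melody))
```

Because the constraint is derived from the melody alone, no melody-lyric aligned data is needed at this step, which mirrors the training/inference split the abstract describes.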

