CL · Mar 23, 2021

A General Framework for Learning Prosodic-Enhanced Representation of Rap Lyrics

arXiv:2103.12615v1
Originality: Incremental advance
AI Analysis

This work addresses the challenge of analyzing rap lyrics for applications like music recommendation, but it is incremental as it builds on existing methods by adding prosodic features.

The paper tackles rap lyrics representation learning by proposing a hierarchical attention variational autoencoder (HAVAE) that integrates semantic and prosodic features. The authors report that it outperforms state-of-the-art approaches across various metrics and tasks.

Learning and analyzing rap lyrics is a significant basis for many web applications, such as music recommendation, automatic music categorization, and music information retrieval, due to the abundance of digital music on the World Wide Web. Although numerous studies have explored the topic, knowledge in this field is far from satisfactory, because critical issues, such as prosodic information and its effective representation, as well as the appropriate integration of various features, are usually ignored. In this paper, we propose a hierarchical attention variational autoencoder framework (HAVAE), which simultaneously considers semantic and prosodic features for rap lyrics representation learning. Specifically, the representation of the prosodic features is encoded from phonetic transcriptions with a novel and effective strategy (i.e., rhyme2vec). Moreover, a feature aggregation strategy is proposed to appropriately integrate various features and generate a prosodic-enhanced representation. A comprehensive empirical evaluation demonstrates that the proposed framework outperforms the state-of-the-art approaches under various metrics in different rap lyrics learning tasks.
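
To make the idea of aggregating semantic and prosodic features concrete, here is a minimal sketch of one common approach: an attention-weighted sum of feature vectors. It is an illustration under stated assumptions, not the paper's HAVAE or its aggregation strategy, whose details the abstract does not give. The function names, the feature names, the example vectors, and the `query` context vector are all hypothetical. In a trained model the query would be learned rather than fixed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(features, query):
    """Attention-weighted sum of equally sized feature vectors.

    `features` maps a hypothetical feature name to its vector.
    `query` is a hypothetical context vector (fixed here for illustration).
    Returns the fused vector and the attention weight of each feature.
    """
    names = list(features)
    # Score each feature vector by its dot product with the query.
    scores = [sum(f * q for f, q in zip(features[n], query)) for n in names]
    weights = softmax(scores)
    dim = len(query)
    # Fuse the vectors: each dimension is a weighted sum across features.
    fused = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Made-up 3-dimensional vectors standing in for real embeddings.
features = {
    "semantic": [0.2, 0.1, 0.7],
    "prosodic": [0.9, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]
fused, weights = aggregate(features, query)
print(weights)
print(fused)
```

With this query, the prosodic vector scores higher than the semantic one, so it receives the larger weight and the fused vector lies closer to it.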

