Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network
This work addresses a challenge in NLP for organizing Brazilian music, but it is incremental as it applies existing methods to a new dataset.
The paper tackles automatic music genre classification for Brazilian songs using only lyrics. Its best model, a BLSTM network, achieves an average F1-score of 0.48, with individual genres such as gospel reaching 0.89.
Organizing songs, albums, and artists into groups with shared similarity can be done with the help of genre labels. In this paper, we present a novel approach for automatically classifying musical genres in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distributed across 14 genres. We apply SVM, Random Forest, and a Bidirectional Long Short-Term Memory (BLSTM) network, combined with different word embedding techniques, to address this classification task. Our experiments show that the BLSTM method outperforms the other models with an average F1-score of $0.48$. Some genres, such as "gospel", "funk-carioca" and "sertanejo", which obtained F1-scores of 0.89, 0.70 and 0.69, respectively, can be considered the most distinct and the easiest to classify in the context of Brazilian musical genres.
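To make the reported metric concrete, the following is a minimal sketch of how per-genre F1-scores and their average can be computed from true and predicted labels. It assumes that the "average" is the unweighted (macro) mean over genres, which the abstract does not state explicitly. The function names `per_class_f1` and `macro_f1` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1} from parallel lists of true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for genre t
        else:
            fp[p] += 1          # genre p was predicted wrongly
            fn[t] += 1          # genre t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Toy labels invented for illustration only; they are not the paper's data.
y_true = ["gospel", "gospel", "funk-carioca", "funk-carioca", "sertanejo", "sertanejo"]
y_pred = ["gospel", "gospel", "funk-carioca", "sertanejo", "sertanejo", "funk-carioca"]

for genre, score in per_class_f1(y_true, y_pred).items():
    print(f"{genre}: {score:.2f}")
print(f"macro average: {macro_f1(y_true, y_pred):.2f}")
```

Under a macro average every genre counts equally regardless of how many songs it has, so a few well-separated genres can score high while the overall average stays low, which is consistent with the gap between 0.89 for "gospel" and the 0.48 average.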