Predicting User Intents and Musical Attributes from Music Discovery Conversations
This work addresses the problem of understanding user needs in music-related conversations. It is an incremental advance, applying existing methods to a new domain.
The paper tackles intent and musical attribute classification in music discovery conversations by using pre-trained language models with chat history concatenation, achieving significant F1 score improvements and outperforming Llama 3 in zero-shot and few-shot settings.
Intent classification is a text understanding task that identifies user needs from input text queries. While intent classification has been extensively studied in various domains, it has received little attention in the music domain. In this paper, we investigate intent classification models for music discovery conversations, focusing on pre-trained language models. Rather than predicting only functional needs (intent classification), we also include a task for classifying musical needs (musical attribute classification). Additionally, we propose concatenating the previous chat history with the single-turn user query in the input text, allowing the model to better understand the overall conversation context. Our proposed model significantly improves the F1 score for both user intent and musical attribute classification, and surpasses the zero-shot and few-shot performance of the pre-trained Llama 3 model.
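The chat history concatenation described above can be illustrated with a minimal sketch. The abstract does not specify the input format, so the separator string, the speaker prefixes, the `max_turns` cutoff, and the function name `build_input` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# Assumed separator; the paper's actual delimiter is not given in the abstract.
SEP = " [SEP] "


def build_input(history: List[Tuple[str, str]], query: str, max_turns: int = 2) -> str:
    """Concatenate up to `max_turns` previous turns with the current user query.

    `history` is a list of (speaker, text) pairs, oldest first.
    The speaker prefixes and the turn limit are hypothetical choices.
    """
    # Guard against max_turns == 0, since history[-0:] would return the whole list.
    recent = history[-max_turns:] if max_turns > 0 else []
    parts = [f"{speaker}: {text}" for speaker, text in recent]
    parts.append(f"user: {query}")
    return SEP.join(parts)


history = [
    ("user", "I need something for a long drive."),
    ("system", "How about some upbeat indie rock?"),
]
print(build_input(history, "Something slower, please."))
```

The resulting string would then be passed to a pre-trained language model classifier in place of the single-turn query alone. With `max_turns=0` the function reduces to the single-turn baseline, which makes it straightforward to compare the two settings.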