Representation Learning for Image-based Music Recommendation
The paper addresses the problem of contextual music recommendation by leveraging image data, but the contribution appears incremental, as it builds on existing representation learning methods for cross-modal tasks.
The paper tackles the problem of bridging the heterogeneity gap between music and image data for contextual recommendation. It proposes a representation learning framework that, in preliminary experiments, retrieves relevant or conceptually similar songs for input images.
Images are one of the most direct sources of contextual information about a user's surrounding environment; hence they are a suitable proxy for contextual recommendation. We propose a novel representation learning framework for image-based music recommendation that bridges the heterogeneity gap between music and image data. The proposed method is a key component for various contextual recommendation tasks. Preliminary experiments show that, for an image-to-song retrieval task, the proposed method retrieves relevant or conceptually similar songs for input images.
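To make the image-to-song retrieval task concrete, the following is a minimal sketch of how retrieval could work once images and songs are embedded in a shared space. It is not the paper's method: the abstract does not describe the model or the similarity measure, so cosine similarity, the function names (`retrieve_songs`, `cosine_similarity`), and the toy embedding values are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve_songs(image_embedding, song_embeddings, top_k=2):
    """Rank songs by similarity to an image embedding and return the top_k.

    Assumes both modalities were already mapped into the same space by
    some learned encoders (hypothetical here; not specified by the source).
    """
    scored = [
        (song_id, cosine_similarity(image_embedding, emb))
        for song_id, emb in song_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings, invented purely for illustration.
song_embeddings = {
    "song_a": [0.9, 0.1, 0.0],
    "song_b": [0.0, 1.0, 0.0],
    "song_c": [0.7, 0.7, 0.1],
}
image_embedding = [1.0, 0.2, 0.0]

ranked = retrieve_songs(image_embedding, song_embeddings, top_k=2)
for song_id, score in ranked:
    print(f"{song_id}: {score:.3f}")
```

In this sketch the heterogeneity gap is assumed to be handled entirely by the encoders that produce the embeddings; the retrieval step itself is a nearest-neighbor search and would look the same for any pair of modalities.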