Joint Text Embedding for Personalized Content-based Recommendation
This work addresses the cold-start problem in recommendation systems for domains such as news, where new items lack user interaction data. The contribution is incremental, since it builds on existing embedding and recommendation techniques.
The paper tackles personalized recommendation for new items that have text content but no interaction data. It proposes a joint text embedding framework that combines text embedding with personalized recommendation. Experiments on real-world datasets show that the model significantly improves recommendation effectiveness.
Learning a good representation of text is key to many recommendation applications. One example is news recommendation, where the texts to be recommended are published continuously every day. However, most existing recommendation techniques, such as matrix-factorization-based methods, rely mainly on interaction histories to learn item representations. While latent factors of items can be learned effectively from user interaction data, in many cases such data is not available, especially for newly emerged items. In this work, we aim to address the problem of personalized recommendation for completely new items for which text information is available. We cast the problem as a personalized text ranking problem and propose a general framework that combines text embedding with personalized recommendation. Users and textual content are embedded into a latent feature space. The text embedding function can be learned end-to-end by predicting user interactions with items. To alleviate sparsity in the interaction data, and to leverage the large amount of text data with few or no user interactions, we further propose a joint text embedding model that incorporates unsupervised text embedding through a combination module. Experimental results show that our model can significantly improve the effectiveness of recommendation systems on real-world datasets.
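The general idea of personalized text ranking can be illustrated with a minimal sketch. It is not the architecture from the paper. It assumes a mean-of-word-vectors text embedding, a dot-product score between user and text vectors, and a pairwise logistic ranking loss with sampled negatives. All of these choices, along with the names `embed_text`, `score`, and `train` and the toy data, are illustrative assumptions. The unsupervised text embedding and the combination module are left out.

```python
import math
import random


def embed_text(tokens, word_vecs, dim):
    """Assumed text embedding: the mean of the vectors of known words."""
    known = [word_vecs[t] for t in tokens if t in word_vecs]
    if not known:
        return [0.0] * dim
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]


def score(user_vec, item_vec):
    """Dot product between a user vector and a text vector."""
    return sum(u * x for u, x in zip(user_vec, item_vec))


def train(interactions, docs, n_users, dim=8, epochs=200, lr=0.1, seed=0):
    """Learn user and word vectors end-to-end from (user, item) interactions."""
    rng = random.Random(seed)
    vocab = sorted({t for toks in docs.values() for t in toks})
    word_vecs = {w: [rng.gauss(0.0, 0.1) for _ in range(dim)] for w in vocab}
    user_vecs = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    seen = {}
    for u, i in interactions:
        seen.setdefault(u, set()).add(i)
    for _ in range(epochs):
        for u, pos in interactions:
            # Sample a negative item the user has not interacted with.
            candidates = [i for i in sorted(docs) if i not in seen[u]]
            if not candidates:
                continue
            neg = rng.choice(candidates)
            x_pos = embed_text(docs[pos], word_vecs, dim)
            x_neg = embed_text(docs[neg], word_vecs, dim)
            u_old = list(user_vecs[u])
            diff = score(u_old, x_pos) - score(u_old, x_neg)
            # Gradient weight of log sigmoid(diff), i.e. sigmoid(-diff).
            g = 1.0 / (1.0 + math.exp(min(diff, 50.0)))
            for k in range(dim):
                user_vecs[u][k] += lr * g * (x_pos[k] - x_neg[k])
            # The text embedding is a mean, so each token gets 1/len of the gradient.
            for toks, sign in ((docs[pos], 1.0), (docs[neg], -1.0)):
                for t in toks:
                    for k in range(dim):
                        word_vecs[t][k] += sign * lr * g * u_old[k] / len(toks)
    return user_vecs, word_vecs


# Toy data (hypothetical): user 0 reads sports items, user 1 reads politics items.
docs = {
    "d1": ["football", "match", "goal"],
    "d2": ["goal", "league", "match"],
    "d3": ["election", "vote", "senate"],
    "d4": ["senate", "policy", "vote"],
}
interactions = [(0, "d1"), (0, "d2"), (1, "d3"), (1, "d4")]
DIM = 8
user_vecs, word_vecs = train(interactions, docs, n_users=2, dim=DIM)

# Cold-start items: never seen in training, scored from their text alone.
new_sports = embed_text(["league", "football", "goal"], word_vecs, DIM)
new_politics = embed_text(["policy", "election", "vote"], word_vecs, DIM)
ranking_for_user0 = sorted(
    [("new_sports", score(user_vecs[0], new_sports)),
     ("new_politics", score(user_vecs[0], new_politics))],
    key=lambda pair: pair[1],
    reverse=True,
)
```

Because items are represented only through their text, a new item can be ranked for each user without any interaction history, which is the cold-start setting described above.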