Learning Representations from Product Titles for Modeling Shopping Transactions
This work addresses the cold-start problem on e-commerce platforms by improving product modeling for new products and for items with short purchase histories.
The paper tackles the cold-start problem in shopping transaction analysis by proposing BASTEXT, a model that learns product representations from textual content such as product titles to capture the relationships between products in a basket. It outperforms state-of-the-art methods on the next product recommendation task.
Shopping transaction analysis is important for understanding the shopping behaviors of customers. Existing models such as association rules are poor at modeling products that have short purchase histories and cannot be applied to new products (the cold-start problem). In this paper, we propose BASTEXT, an efficient model of shopping baskets and the texts associated with the products (e.g., product titles). The model's goal is to learn the product representations from the textual contents to capture the relationships between the products in the baskets. Given the products already in a basket, a classifier identifies whether a potential product is relevant to the basket based on their vector representations. This relevancy enables us to learn high-quality representations of the products. The experiments demonstrate that BASTEXT can efficiently model millions of baskets and that it outperforms the state-of-the-art methods in the next product recommendation task. We also show that BASTEXT is a strong baseline for keyword-based product search.
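The basket-relevance idea described in the abstract can be illustrated with a minimal sketch. This is not the BASTEXT architecture, which the abstract does not specify. It assumes, purely for illustration, that a product vector is the mean of the word vectors of its title, that a basket vector is the mean of its product vectors, and that relevance is a sigmoid of their dot product. All names (`make_word_vectors`, `title_vector`, `relevance`, `DIM`) are hypothetical, and the word vectors are random stand-ins for learned embeddings.

```python
import math
import random

DIM = 8  # assumed embedding size, chosen only for illustration


def make_word_vectors(titles, dim=DIM, seed=0):
    """Give every title token a random vector (stand-in for learned embeddings)."""
    rng = random.Random(seed)
    vocab = sorted({tok for title in titles for tok in title.lower().split()})
    return {tok: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for tok in vocab}


def mean_vector(vectors, dim=DIM):
    """Element-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def title_vector(title, word_vectors, dim=DIM):
    """Represent a product by averaging the vectors of its known title tokens."""
    known = [word_vectors[t] for t in title.lower().split() if t in word_vectors]
    return mean_vector(known, dim)


def relevance(basket_titles, candidate_title, word_vectors, dim=DIM):
    """Score a candidate against a basket: sigmoid of the dot product (assumed form)."""
    basket_vec = mean_vector(
        [title_vector(t, word_vectors, dim) for t in basket_titles], dim
    )
    candidate_vec = title_vector(candidate_title, word_vectors, dim)
    score = sum(b * c for b, c in zip(basket_vec, candidate_vec))
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    catalog = ["organic whole milk", "chocolate chip cookies", "usb charging cable"]
    word_vectors = make_word_vectors(catalog)
    # A product never seen in any basket still gets a vector from its title tokens.
    print(relevance(["organic whole milk"], "chocolate milk cookies", word_vectors))
```

Because the representation depends only on title tokens, a product with no purchase history can still be scored, which is the cold-start property the abstract emphasizes. In this sketch, a title with no known tokens maps to the zero vector and therefore receives a neutral score of 0.5.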