Sriraj Badam

IR
h-index: 5
3 papers
24 citations
Novelty: 50%
AI Score: 29

3 Papers

IR · Sep 30, 2022
Reward Shaping for User Satisfaction in a REINFORCE Recommender

Konstantina Christakopoulou, Can Xu, Sai Zhang et al.

How might we design Reinforcement Learning (RL)-based recommenders that encourage aligning user trajectories with the underlying user satisfaction? Three research questions are key: (1) measuring user satisfaction, (2) combatting sparsity of satisfaction signals, and (3) adapting the training of the recommender agent to maximize satisfaction. For measurement, it has been found that surveys explicitly asking users to rate their experience with consumed items can provide valuable orthogonal information to the engagement/interaction data, acting as a proxy to the underlying user satisfaction. For sparsity, i.e., only being able to observe how satisfied users are with a tiny fraction of user-item interactions, imputation models can be useful in predicting satisfaction level for all items users have consumed. For learning satisfying recommender policies, we postulate that reward shaping in RL recommender agents is powerful for driving satisfying user experiences. Putting everything together, we propose to jointly learn a policy network and a satisfaction imputation network: the role of the imputation network is to learn which actions are satisfying to the user, while the policy network, built on top of REINFORCE, decides which items to recommend, with the reward utilizing the imputed satisfaction. We use both offline analysis and live experiments in an industrial large-scale recommendation platform to demonstrate the promise of our approach for satisfying user experiences.
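
The abstract above does not say how the imputed satisfaction enters the reward, so the following is only a minimal sketch under stated assumptions: the shaped reward is taken to be an additive mix of an engagement signal and an imputed satisfaction score, and the policy is a plain softmax over item logits updated with a single REINFORCE step. The names `shaped_reward`, `reinforce_step`, and `weight` are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shaped_reward(engagement, imputed_satisfaction, weight=1.0):
    """Assumed additive shaping: engagement plus weighted imputed satisfaction."""
    return engagement + weight * imputed_satisfaction


def reinforce_step(logits, action, reward, lr=0.1):
    """One REINFORCE update on softmax logits.

    For a softmax policy, the gradient of log pi(action) with respect to
    the logits is one_hot(action) - probs.
    """
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


# Usage: item 1 was recommended, the user engaged, and the (hypothetical)
# imputation model predicts moderate satisfaction.
reward = shaped_reward(engagement=1.0, imputed_satisfaction=0.5, weight=2.0)
new_logits = reinforce_step([0.0, 0.0, 0.0], action=1, reward=reward)
```

With a positive shaped reward, the logit of the recommended item rises and the others fall, so items predicted to be satisfying become more likely to be recommended again.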

IR · May 20, 2024
Beyond Item Dissimilarities: Diversifying by Intent in Recommender Systems

Yuyan Wang, Cheenar Banerjee, Samer Chucri et al.

It has become increasingly clear that recommender systems that overly focus on short-term engagement prevent users from exploring diverse interests, ultimately hurting long-term user experience. To tackle this challenge, numerous diversification algorithms have been proposed. These algorithms typically rely on measures of item similarity, aiming to maximize the dissimilarity across items in the final set of recommendations. However, in this work, we demonstrate the benefits of going beyond item-level similarities by utilizing higher-level user understanding--specifically, user intents that persist across multiple interactions--in diversification. Our approach is motivated by the observation that user behaviors on online platforms are largely driven by their underlying intents. Therefore, recommendations should ensure that diverse user intents are accurately represented. While intent has primarily been studied in the context of search, it is less clear how to incorporate real-time dynamic intent predictions into recommender systems. To address this gap, we develop a probabilistic intent-based whole-page diversification framework for the final stage of a recommender system. Starting with a prior belief of user intents, the proposed framework sequentially selects items for each position based on these beliefs and subsequently updates posterior beliefs about the intents. This approach ensures that different user intents are represented on a page, towards optimizing long-term user experience. We experiment with the intent diversification framework on YouTube, the world's largest video recommendation platform, serving billions of users daily. Live experiments on a diverse set of intents show that the proposed framework increases Daily Active Users (DAU) and overall user enjoyment, validating its effectiveness in facilitating long-term planning.
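
The select-then-update loop described above can be illustrated with a small sketch. The abstract does not give the scoring or update rules, so this example makes assumptions: each item carries a hypothetical probability of satisfying each intent, the next item is the one with the highest belief-weighted relevance, and the belief in an intent shrinks by the probability that the chosen item already satisfied it. The function `diversify_by_intent` and the sample data are illustrative only.

```python
def diversify_by_intent(prior, relevance, k):
    """Greedily fill k page positions while tracking beliefs over intents.

    prior:     dict mapping intent -> prior probability
    relevance: dict mapping item -> {intent: P(item satisfies intent)}
    """
    beliefs = dict(prior)
    remaining = set(relevance)
    page = []
    for _ in range(min(k, len(relevance))):
        # Score each remaining item by its belief-weighted relevance.
        best = max(
            sorted(remaining),
            key=lambda item: sum(
                beliefs[i] * relevance[item].get(i, 0.0) for i in beliefs
            ),
        )
        page.append(best)
        remaining.remove(best)
        # Assumed posterior update: down-weight intents the chosen item
        # likely satisfied, then renormalize.
        unnorm = {
            i: beliefs[i] * (1.0 - relevance[best].get(i, 0.0)) for i in beliefs
        }
        total = sum(unnorm.values())
        if total > 0:
            beliefs = {i: v / total for i, v in unnorm.items()}
    return page


prior = {"music": 0.7, "news": 0.3}
relevance = {
    "m1": {"music": 0.9},
    "m2": {"music": 0.8},
    "n1": {"news": 0.9},
}
page = diversify_by_intent(prior, relevance, k=2)  # ["m1", "n1"]
```

Ranking by prior-weighted relevance alone would fill both positions with music items. After `m1` is placed, the belief shifts toward the still-unserved news intent, so `n1` takes the second position.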

IR · Jan 26, 2022
Recency Dropout for Recurrent Recommender Systems

Bo Chang, Can Xu, Matthieu Lê et al.

Recurrent recommender systems have been successful in capturing the temporal dynamics in users' activity trajectories. However, recurrent neural networks (RNNs) are known to have difficulty learning long-term dependencies. As a consequence, RNN-based recommender systems tend to overly focus on short-term user interests. This is referred to as the recency bias, which could negatively affect the long-term user experience as well as the health of the ecosystem. In this paper, we introduce the recency dropout technique, a simple yet effective data augmentation technique to alleviate the recency bias in recurrent recommender systems. We demonstrate the effectiveness of recency dropout in various experimental settings including a simulation study, offline experiments, as well as live experiments on a large-scale industrial recommendation platform.
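
The abstract describes recency dropout only as a data augmentation technique, so the sketch below is one plausible reading, not the paper's exact procedure: during training, the most recent items of a user's history are removed with some probability, which forces the model to predict from older activity. The function name and the parameters `drop_prob` and `max_drop` are assumptions made for illustration.

```python
import random


def recency_dropout(history, drop_prob=0.5, max_drop=3, rng=None):
    """Return a copy of history with its most recent items randomly removed.

    history:   list of item ids, oldest first
    drop_prob: probability of applying dropout to this sequence
    max_drop:  largest number of trailing items that may be removed
    """
    rng = rng if rng is not None else random
    # Keep very short histories intact, and skip dropout with
    # probability 1 - drop_prob.
    if len(history) <= 1 or rng.random() >= drop_prob:
        return list(history)
    # Always leave at least one item in the sequence.
    n_drop = rng.randint(1, min(max_drop, len(history) - 1))
    return list(history[:-n_drop])


# Usage: augment a training sequence before feeding it to the RNN.
augmented = recency_dropout([101, 102, 103, 104, 105], rng=random.Random(0))
```

Only the training input is truncated in this sketch. The prediction target and the serving-time input would remain unchanged.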