LG ML | Jul 7, 2022

Multi-objective Optimization of Notifications Using Offline Reinforcement Learning

Prakruthi Prabhakar, Yiping Yuan, Guangyu Yang, Wensheng Sun, Ajith Muralidharan

arXiv:2207.03029v16.97 citations | h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses the challenge of balancing multiple objectives in notification systems for mobile applications. It is an incremental advance in applying offline RL to a specific domain.

The paper addresses mobile notification decisions by formulating them as a Markov Decision Process with multiple objectives in the reward. It proposes an offline reinforcement learning framework built on a Double Deep Q-network with Conservative Q-learning, and reports performance improvements in both offline and online experiments.

Mobile notification systems play a major role in many applications, sending alerts and reminders that inform users about news, events, or messages. In this paper, we formulate the near-real-time notification decision problem as a Markov Decision Process where we optimize for multiple objectives in the rewards. We propose an end-to-end offline reinforcement learning framework to optimize sequential notification decisions. We address the challenge of offline learning using a Double Deep Q-network method based on Conservative Q-learning that mitigates the distributional shift problem and Q-value overestimation. We illustrate our fully deployed system and demonstrate the performance and benefits of the proposed approach through both offline and online experiments.
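To make the abstract's two ingredients concrete, here is a minimal NumPy sketch of a scalarized multi-objective reward and a Double DQN loss with a Conservative Q-learning (CQL) penalty. It is an illustration under stated assumptions, not the authors' implementation: the objective names (`engagement`, `volume_cost`), the linear weighting, and the hyperparameters `gamma` and `alpha` are hypothetical, and Q-values are passed in as plain arrays rather than produced by a neural network.

```python
import numpy as np


def scalarize_reward(engagement, volume_cost, weight=0.5):
    """Combine two hypothetical objectives into a single scalar reward.

    Assumption: a linear trade-off; the abstract does not specify the
    actual reward terms or weights.
    """
    return engagement - weight * volume_cost


def cql_double_dqn_loss(q_online, q_online_next, q_target_next,
                        actions, rewards, dones, gamma=0.99, alpha=1.0):
    """Double DQN TD loss plus a CQL penalty on a batch of logged transitions.

    q_online:      (batch, n_actions) online Q-values at the current states
    q_online_next: (batch, n_actions) online Q-values at the next states
    q_target_next: (batch, n_actions) target-network Q-values at the next states
    actions:       (batch,) integer actions taken in the logged data
    rewards:       (batch,) scalar rewards
    dones:         (batch,) 1.0 if the episode ended, else 0.0
    """
    batch = np.arange(len(actions))

    # Double DQN: the online network selects the next action and the
    # target network evaluates it, which reduces overestimation.
    next_actions = q_online_next.argmax(axis=1)
    next_values = q_target_next[batch, next_actions]
    td_target = rewards + gamma * (1.0 - dones) * next_values

    q_taken = q_online[batch, actions]
    td_loss = np.mean((q_taken - td_target) ** 2)

    # CQL penalty: push down a soft maximum of Q over all actions
    # (numerically stable log-sum-exp) and push up Q for logged actions.
    row_max = q_online.max(axis=1)
    logsumexp = row_max + np.log(np.exp(q_online - row_max[:, None]).sum(axis=1))
    cql_penalty = np.mean(logsumexp - q_taken)

    return td_loss + alpha * cql_penalty, td_loss, cql_penalty
```

The penalty term is never negative, because the log-sum-exp over actions is at least as large as the Q-value of any single action. Raising `alpha` makes the learned policy stay closer to the actions seen in the logged data, which is how CQL limits the effect of distributional shift.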
