LG · ML · Jul 7, 2022

Multi-objective Optimization of Notifications Using Offline Reinforcement Learning

arXiv:2207.03029v1 · 7 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This work addresses the challenge of balancing multiple objectives in notification systems for mobile applications. It is an incremental advance that applies offline RL to a specific domain.

The paper tackles the problem of optimizing mobile notification decisions by formulating it as a Markov Decision Process with multiple objectives in the reward. It proposes an offline reinforcement learning framework based on Conservative Q-learning and reports performance improvements in both offline and online experiments.

Mobile notification systems play a major role in a variety of applications to communicate, send alerts and reminders to the users to inform them about news, events or messages. In this paper, we formulate the near-real-time notification decision problem as a Markov Decision Process where we optimize for multiple objectives in the rewards. We propose an end-to-end offline reinforcement learning framework to optimize sequential notification decisions. We address the challenge of offline learning using a Double Deep Q-network method based on Conservative Q-learning that mitigates the distributional shift problem and Q-value overestimation. We illustrate our fully-deployed system and demonstrate the performance and benefits of the proposed approach through both offline and online experiments.
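The abstract names three ingredients: a reward that combines multiple objectives, a Double Deep Q-network, and a Conservative Q-learning (CQL) regularizer. The following is a minimal NumPy sketch of how these could fit together for a discrete send/don't-send action space. It is not the authors' implementation. The weighted-sum scalarization, the function names (`scalarize_reward`, `cql_double_dqn_loss`), and the `gamma` and `alpha` defaults are all assumptions made for illustration; the summary above does not give the paper's actual reward weights or architecture.

```python
import numpy as np

def scalarize_reward(objectives, weights):
    """Combine per-objective rewards into one scalar per transition.

    Assumption: a simple weighted sum; the weights are hypothetical.
    """
    return np.asarray(objectives, dtype=float) @ np.asarray(weights, dtype=float)

def logsumexp(x, axis=-1):
    """Numerically stable log-sum-exp along `axis`."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def cql_double_dqn_loss(q_online, q_online_next, q_target_next,
                        actions, rewards, dones, gamma=0.99, alpha=1.0):
    """Return (total_loss, td_loss, cql_penalty) for a batch of logged transitions.

    q_online:      Q(s, .) from the online network, shape (batch, n_actions)
    q_online_next: Q(s', .) from the online network, shape (batch, n_actions)
    q_target_next: Q(s', .) from the target network, shape (batch, n_actions)
    """
    actions = np.asarray(actions, dtype=int)
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    batch = np.arange(len(actions))

    # Double DQN: the online network picks the next action,
    # the target network evaluates it.
    next_actions = np.argmax(q_online_next, axis=1)
    next_values = q_target_next[batch, next_actions]
    td_target = rewards + gamma * (1.0 - dones) * next_values

    q_taken = q_online[batch, actions]
    td_loss = 0.5 * np.mean((q_taken - td_target) ** 2)

    # CQL penalty: push down a soft maximum over all actions while
    # pushing up the Q-value of the action actually present in the log.
    cql_penalty = np.mean(logsumexp(q_online, axis=1) - q_taken)

    return td_loss + alpha * cql_penalty, td_loss, cql_penalty
```

The penalty term is never negative, because the log-sum-exp over all actions is at least as large as the Q-value of any single action. Raising `alpha` therefore makes the learned Q-function more pessimistic about actions that are rare in the logged data, which is how CQL counters distributional shift and overestimation.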
