LG · AI · Jan 29, 2022

Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes

arXiv:2201.12569v2 · 23 citations
Originality: Incremental advance
AI Analysis

This work addresses a gap in reinforcement learning for domains with asynchronous, event-driven dynamics, with applications in social media, finance, and health informatics. It is rated incremental because its contribution is a new combination of existing techniques (Hawkes processes and the Bellman equation) rather than a new technique.

The paper tackles sequential decision-making in environments driven by stochastic discrete events, such as social media and finance. It introduces a model-based reinforcement learning framework that embeds a Hawkes process model of the environment in the Bellman equation to optimize long-term reward. The authors report superior performance on both a synthetic simulator and a real-world problem, though the abstract gives no numerical results.

We consider a sequential decision-making problem in which the agent faces an environment characterized by stochastic discrete events and seeks an optimal intervention policy that maximizes its long-term reward. This problem arises widely in social media, finance, and health informatics but is rarely investigated in conventional reinforcement learning research. To this end, we present a novel model-based reinforcement learning framework in which the agent's actions and observations are asynchronous stochastic discrete events occurring in continuous time. We model the dynamics of the environment with a Hawkes process that includes an external intervention control term, and we develop an algorithm that embeds this process in the Bellman equation, which guides the direction of the value gradient. We demonstrate the superiority of our method on both a synthetic simulator and a real-world problem.
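To make the environment model concrete, here is a minimal sketch of a univariate Hawkes process whose intensity has an additive intervention term, simulated with Ogata's thinning algorithm. The abstract does not give the paper's functional form, so everything here is an assumption for illustration: the exponential kernel, the additive non-negative control term, the parameter names (`mu`, `alpha`, `beta`, `gamma`), and the fixed list of intervention times. It covers only the environment-dynamics half of the framework and does not implement the Bellman or value-gradient step.

```python
import math
import random


def intensity(t, events, actions, mu=0.5, alpha=0.8, beta=1.5, gamma=0.6):
    """Right-limit intensity at time t (assumed form, not the paper's).

    lambda(t) = mu + alpha * sum_i exp(-beta (t - t_i))
                   + gamma * sum_j exp(-beta (t - a_j))
    where t_i are past events and a_j are past intervention times.
    """
    excite = sum(math.exp(-beta * (t - s)) for s in events if s <= t)
    control = sum(math.exp(-beta * (t - a)) for a in actions if a <= t)
    return mu + alpha * excite + gamma * control


def simulate(horizon, actions, seed=0, **params):
    """Ogata thinning on [0, horizon) with fixed intervention times."""
    rng = random.Random(seed)
    actions = sorted(actions)
    events, t = [], 0.0
    while t < horizon:
        # With non-negative alpha and gamma the intensity only decays until
        # the next event or intervention, so its current value is a bound.
        lam_bar = intensity(t, events, actions, **params)
        cand = t + rng.expovariate(lam_bar)
        next_action = next((a for a in actions if a > t), math.inf)
        if next_action <= cand and next_action < horizon:
            # The intensity jumps at an intervention: restart from there.
            t = next_action
            continue
        if cand >= horizon:
            break
        t = cand
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= intensity(t, events, actions, **params):
            events.append(t)
    return events


if __name__ == "__main__":
    baseline = simulate(50.0, actions=[])
    treated = simulate(50.0, actions=[5.0, 10.0, 15.0, 20.0])
    print(len(baseline), "events without interventions")
    print(len(treated), "events with interventions")
```

In a model-based setup like the one the abstract describes, a fitted model of this kind would stand in for the real environment when evaluating candidate intervention policies. Here the interventions are a fixed list rather than the output of a learned policy.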

Code Implementations: 1 repo
