LG · RO · Jul 16, 2024

Bellman Diffusion Models

arXiv:2407.12163v2 · 4 citations · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for reinforcement learning researchers, focusing on a specific domain application.

The paper tackles the problem of modeling the successor state measure (SSM) of a policy in reinforcement learning. It enforces the Bellman flow constraints on a diffusion model, which yields a simple Bellman update on the diffusion step distribution.

Diffusion models have seen tremendous success as generative architectures. Recently, they have been shown to be effective at modelling policies for offline reinforcement learning and imitation learning. We explore using diffusion as a model class for the successor state measure (SSM) of a policy. We find that enforcing the Bellman flow constraints leads to a simple Bellman update on the diffusion step distribution.
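To make the Bellman flow constraint concrete, here is a minimal tabular sketch. It is not the paper's diffusion-based method. It assumes a small finite MDP and the normalized convention M(s, a, x) = (1 - gamma) P(x | s, a) + gamma E[M(s', a', x)], with s' drawn from P(. | s, a) and a' from the policy; the paper's exact normalization and indexing may differ. The function names, the random MDP, and all parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def successor_measure(P, pi, gamma, n_iters=500):
    """Fixed-point iteration of the (assumed) Bellman flow equation.

    P  : transition tensor, shape (S, A, S), P[s, a, x] = P(x | s, a)
    pi : policy matrix, shape (S, A), pi[s, a] = pi(a | s)
    """
    S, A, _ = P.shape
    M = np.full((S, A, S), 1.0 / S)  # start from a uniform distribution
    for _ in range(n_iters):
        # Average M over the policy's action at the next state:
        # V[s', x] = sum_a' pi[s', a'] * M[s', a', x]
        V = np.einsum("sa,sax->sx", pi, M)
        # Bellman flow update:
        # M[s, a, x] = (1 - gamma) P[s, a, x] + gamma * sum_s' P[s, a, s'] V[s', x]
        M = (1.0 - gamma) * P + gamma * np.einsum("sat,tx->sax", P, V)
    return M

def successor_measure_closed_form(P, pi, gamma):
    """Solve (I - gamma T) M = (1 - gamma) P directly, for comparison."""
    S, A, _ = P.shape
    P_flat = P.reshape(S * A, S)
    # T[(s, a), (s', a')] = P[s, a, s'] * pi[s', a']
    T = (P[:, :, :, None] * pi[None, None, :, :]).reshape(S * A, S * A)
    M = (1.0 - gamma) * np.linalg.solve(np.eye(S * A) - gamma * T, P_flat)
    return M.reshape(S, A, S)

# Hypothetical random MDP with 4 states and 2 actions.
rng = np.random.default_rng(0)
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))  # shape (S, A, S), rows sum to 1
pi = rng.dirichlet(np.ones(A), size=S)      # shape (S, A), rows sum to 1

M_iter = successor_measure(P, pi, gamma)
M_exact = successor_measure_closed_form(P, pi, gamma)
```

Under these assumptions the iteration is a gamma-contraction, so `M_iter` converges to `M_exact`, and each `M[s, a, :]` stays a probability distribution over future states. In the tabular case the measure can be stored explicitly. A generative model such as a diffusion model is relevant when the state space is too large or is continuous, so the measure can only be sampled from.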

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
