LG | Jan 22, 2025

State Combinatorial Generalization In Decision Making With Conditional Diffusion Models

arXiv:2501.13241v1 | 2 citations | h-index: 9 | Trans. Mach. Learn. Res.
Originality: Incremental advance
AI Analysis

This work addresses a critical challenge in real-world AI applications such as self-driving cars, where combinatorial complexity makes it infeasible to train on every possible state. The advance is incremental, since it builds on existing diffusion model techniques.

The paper tackles the problem of zero-shot generalization in combinatorial decision-making, where states are unseen combinations of seen elements, and demonstrates that behavior cloning with conditioned diffusion models outperforms traditional reinforcement learning methods in maze, driving, and multiagent environments.

Many real-world decision-making problems are combinatorial in nature, where states (e.g., surrounding traffic of a self-driving car) can be seen as a combination of basic elements (e.g., pedestrians, trees, and other cars). Due to combinatorial complexity, observing all combinations of basic elements in the training set is infeasible, which leads to an essential yet understudied problem of zero-shot generalization to states that are unseen combinations of previously seen elements. In this work, we first formalize this problem and then demonstrate how existing value-based reinforcement learning (RL) algorithms struggle due to unreliable value predictions in unseen states. We argue that this problem cannot be addressed with exploration alone, but requires more expressive and generalizable models. We demonstrate that behavior cloning with a conditioned diffusion model trained on expert trajectories generalizes better to states formed by new combinations of seen elements than traditional RL methods. Through experiments in maze, driving, and multiagent environments, we show that conditioned diffusion models outperform traditional RL techniques and highlight the broad applicability of our problem formulation.
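
The setting described in the abstract, test states that are unseen combinations of previously seen elements, can be illustrated with a small sketch. This is not the paper's formalization or code: the element names, the pair-sized states, and the helpers `split_combinations`, `seen_elements`, and `is_combinatorial_zero_shot` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical basic elements (illustrative only, not taken from the paper).
ELEMENTS = ["pedestrian", "tree", "car", "cyclist"]


def split_combinations(elements, size=2, holdout_every=3):
    """Enumerate all size-`size` combinations and hold out every n-th one for testing."""
    combos = [frozenset(c) for c in combinations(elements, size)]
    test = combos[::holdout_every]
    train = [c for c in combos if c not in test]
    return train, test


def seen_elements(states):
    """Return the set of basic elements that appear in any of the given states."""
    return set().union(*states) if states else set()


def is_combinatorial_zero_shot(train, test):
    """Check that no test state appears in training, yet every test element does."""
    no_overlap = not (set(train) & set(test))
    elements_covered = seen_elements(test) <= seen_elements(train)
    return no_overlap and elements_covered


train_states, test_states = split_combinations(ELEMENTS)
print("train:", [sorted(s) for s in train_states])
print("test: ", [sorted(s) for s in test_states])
print("zero-shot combinatorial split:", is_combinatorial_zero_shot(train_states, test_states))
```

A split that passes this check gives a policy no test state during training while still exposing it to every individual element, which is the kind of generalization gap the abstract describes.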
