Towards a Unified Framework for Sequential Decision Making
This work offers a preliminary framework that lets researchers in AI and machine learning understand and compare different sequential decision-making methods. The contribution is incremental: it builds on existing concepts rather than introducing new paradigms.
The authors address the integration of Automated Planning and Reinforcement Learning by proposing a unified framework for Sequential Decision Making. The framework formulates a task as a set of training and test Markov Decision Processes, to account for generalization, and provides a general algorithm together with formulas for evaluation.
In recent years, the integration of Automated Planning (AP) and Reinforcement Learning (RL) has seen a surge of interest. To perform this integration, a general framework for Sequential Decision Making (SDM) would prove immensely useful, as it would help us understand how AP and RL fit together. In this preliminary work, we attempt to provide such a framework, suitable for any method ranging from Classical Planning to Deep RL, by drawing on concepts from Probability Theory and Bayesian inference. We formulate an SDM task as a set of training and test Markov Decision Processes (MDPs), to account for generalization. We provide a general algorithm for SDM on which, we hypothesize, every SDM method is based. According to this algorithm, every SDM method can be seen as a procedure that iteratively improves its solution estimate by leveraging the available task knowledge. Finally, we derive a set of formulas and algorithms for calculating properties of SDM tasks and methods, which make their empirical evaluation and comparison possible.
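To make the task formulation concrete, the following is a minimal sketch, not the general algorithm proposed in this work. It assumes a hypothetical toy environment (`ChainMDP`), a tabular policy as the solution estimate, and a simple coordinate-wise policy search as the improvement step. All names (`ChainMDP`, `episode_return`, `mean_return`, `solve`) are illustrative assumptions. The sketch only shows the two ideas stated above: a task given as training and test MDPs, and a procedure that iteratively improves a solution estimate using what it knows about the training MDPs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainMDP:
    """Hypothetical toy MDP: states 0..length-1 on a line, last state is the goal."""
    length: int

    def step(self, state, action):
        # action is -1 (left) or +1 (right); transitions are deterministic
        nxt = min(max(state + action, 0), self.length - 1)
        done = nxt == self.length - 1
        # every step costs 1 except the step that reaches the goal
        return nxt, (0.0 if done else -1.0), done


def episode_return(mdp, policy, horizon=20):
    """Total reward of one episode starting in state 0."""
    state, total = 0, 0.0
    for _ in range(horizon):
        action = policy.get(state, -1)  # states never seen in training default to "left"
        state, reward, done = mdp.step(state, action)
        total += reward
        if done:
            break
    return total


def mean_return(mdps, policy):
    """Average episode return of a policy over a set of MDPs."""
    return sum(episode_return(m, policy) for m in mdps) / len(mdps)


def solve(train_mdps, iterations=3):
    """Iteratively improve a tabular policy using only the training MDPs."""
    n_states = max(m.length for m in train_mdps) - 1  # non-goal states seen in training
    policy = {s: -1 for s in range(n_states)}  # initial solution estimate
    for _ in range(iterations):
        for s in range(n_states):
            candidate = dict(policy)
            candidate[s] = -candidate[s]  # flip the action in one state
            # keep the change if training performance does not get worse
            if mean_return(train_mdps, candidate) >= mean_return(train_mdps, policy):
                policy = candidate
    return policy


train_mdps = [ChainMDP(4), ChainMDP(5)]
test_mdps = [ChainMDP(6)]
policy = solve(train_mdps)
print("train:", mean_return(train_mdps, policy))
print("test: ", mean_return(test_mdps, policy))
```

In this toy setting the learned policy reaches the goal on both training chains, yet it scores far worse on the longer test chain, because the tabular estimate has no entry for the extra state. Evaluating on separate test MDPs is what exposes this gap, which is the motivation for including them in the task definition.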