Solving Relational MDPs with Exogenous Events and Additive Rewards
This work addresses planning in service domains with relational structure and gives the first algorithm with explicit performance guarantees for a specific subclass of relational MDPs with exogenous events.
The authors address relational Markov Decision Processes (MDPs) with exogenous events and additive rewards, as arise in inventory control, and develop a new symbolic planning algorithm that, under certain technical conditions, provides a monotonic lower bound on the optimal value function.
We formalize a simple but natural subclass of service domains for relational planning problems with object-centered, independent exogenous events and additive rewards, capturing, for example, problems in inventory control. Focusing on this subclass, we present a new symbolic planning algorithm, the first with explicit performance guarantees for relational MDPs with exogenous events. In particular, under some technical conditions, our planning algorithm provides a monotonic lower bound on the optimal value function. To support this algorithm, we present novel evaluation and reduction techniques for generalized first-order decision diagrams, a knowledge representation for real-valued functions over relational world states. Our planning algorithm uses a set of focus states, which serves as a training set, to simplify and approximate the symbolic solution, and can thus be seen to perform learning for planning. A preliminary experimental evaluation demonstrates the validity of our approach.
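
To make the problem class concrete, the following is a minimal sketch of a toy inventory domain with object-centered, independent exogenous events and an additive reward. It is not the symbolic algorithm described above: it enumerates ground states rather than using decision diagrams, and the names `N_SHOPS`, `P_EMPTY`, `GAMMA`, and all function names are hypothetical choices made for illustration. The monotonic lower bound it exhibits comes from the standard fact that value iteration started at zero with non-negative rewards is non-decreasing, which is a simpler mechanism than the technical conditions the abstract refers to.

```python
from itertools import product

N_SHOPS = 3      # hypothetical number of shops (the "objects")
P_EMPTY = 0.4    # assumed probability that a stocked shop sells out in one step
GAMMA = 0.9      # assumed discount factor

STATES = list(product([False, True], repeat=N_SHOPS))  # True = shop is stocked
ACTIONS = [None] + list(range(N_SHOPS))                # no-op, or restock shop i


def reward(state):
    """Additive reward: one unit per stocked shop."""
    return float(sum(state))


def apply_action(state, action):
    """Deterministic agent action: restock at most one shop."""
    if action is None:
        return state
    return tuple(True if i == action else full for i, full in enumerate(state))


def exogenous_outcomes(state):
    """Yield (next_state, prob); each stocked shop sells out independently."""
    per_shop = []
    for full in state:
        if full:
            per_shop.append([(True, 1.0 - P_EMPTY), (False, P_EMPTY)])
        else:
            per_shop.append([(False, 1.0)])
    for combo in product(*per_shop):
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield tuple(v for v, _ in combo), prob


def bellman_backup(values):
    """One step of value iteration: agent acts, then exogenous events fire."""
    new_values = {}
    for s in STATES:
        best = max(
            sum(p * values[s2] for s2, p in exogenous_outcomes(apply_action(s, a)))
            for a in ACTIONS
        )
        new_values[s] = reward(s) + GAMMA * best
    return new_values


def value_iteration(n_iters):
    """Return the list of value functions V_0, ..., V_n, starting from V_0 = 0."""
    values = {s: 0.0 for s in STATES}
    history = [values]
    for _ in range(n_iters):
        values = bellman_backup(values)
        history.append(values)
    return history
```

Calling `value_iteration(20)` returns a sequence of value functions in which each iterate is pointwise at least as large as the previous one, so every iterate is a lower bound on the optimal value function of this toy problem. The ground enumeration grows exponentially in the number of shops, which is the kind of blow-up a relational, symbolic representation is meant to avoid.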