Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles
This work addresses automated reasoning for spatial puzzles, offering an incremental improvement in the AI planning and reinforcement learning domains.
The paper tackles the problem of solving spatial puzzles by adapting oASP(MDP), an algorithm that combines Answer Set Programming (ASP) with Markov Decision Processes (MDP), to use heuristics that accelerate learning. It shows that the proposed approach learns faster than the non-heuristic versions.
Spatial puzzles composed of rigid objects, flexible strings and holes offer interesting domains for reasoning about spatial entities that are common in everyday human activities. The goal of this work is to investigate the automated solution of this kind of puzzle by adapting oASP(MDP), an algorithm that combines Answer Set Programming (ASP) with Markov Decision Processes (MDP), to use heuristics that accelerate the learning process. ASP is used to represent the domain as an MDP, while a Reinforcement Learning algorithm (Q-Learning) is used to find the optimal policies. In this work, the heuristics were obtained from the solutions of relaxed versions of the puzzles. Experiments were performed on deterministic, non-deterministic and non-stationary versions of the puzzles. The results show that the proposed approach can accelerate the learning process, with an advantage over the non-heuristic versions of oASP(MDP) and Q-Learning.
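To make the idea of heuristic-guided Q-Learning concrete, the following is a minimal sketch and not the oASP(MDP) implementation. It assumes one common way of using a heuristic, namely adding a weighted bonus to the Q-values during action selection only, and the abstract does not confirm that the paper applies its heuristics this way. The six-state chain, the `heuristic` function (which stands in for a solution of a relaxed puzzle), and the weight `xi` are all hypothetical. The ASP component is omitted entirely.

```python
import random

# Hypothetical toy domain standing in for a puzzle: a chain of states
# 0..N_STATES-1 with the goal at the right end.
N_STATES = 6
ACTIONS = (-1, +1)  # move left / move right


def step(state, action):
    """Deterministic transition; reward 1.0 only on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def heuristic(state, action):
    """Assumed heuristic: favor the action a relaxed solution would take.

    In this toy chain the 'relaxed solution' is simply 'always move right'.
    """
    return 1.0 if action == +1 else 0.0


def choose_action(q, state, epsilon, xi, rng):
    """Epsilon-greedy selection over Q-values plus a weighted heuristic bonus."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    scores = {a: q[(state, a)] + xi * heuristic(state, a) for a in ACTIONS}
    best = max(scores.values())
    # Break ties at random so the xi = 0 case is not biased toward one action.
    return rng.choice([a for a in ACTIONS if scores[a] == best])


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, xi=1.0, seed=0):
    """Tabular Q-Learning; xi = 0.0 disables the heuristic."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    steps_per_episode = []
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 1000:
            action = choose_action(q, state, epsilon, xi, rng)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Standard Q-Learning update; the heuristic does not enter here.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            steps += 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


def greedy_policy(q):
    """Greedy action per non-goal state, using the learned Q-values only."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

Calling `q_learning(xi=1.0)` and `q_learning(xi=0.0)` and comparing the returned `steps_per_episode` lists gives a rough picture of how a heuristic can shorten early episodes. Because the bonus affects only action selection, the update rule itself is unchanged from plain Q-Learning.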