Online inductive learning from answer sets for efficient reinforcement learning exploration
This work aims to improve both the training performance and the explainability of reinforcement learning agents in complex environments. It is an incremental advance that integrates existing methods in a novel way.
The paper tackles inefficient exploration in reinforcement learning by combining inductive logic programming with Q-learning. Explainable logical rules are learned from experience and used to guide exploration, which increases the discounted return in Pac-Man scenarios without increasing computational time.
This paper presents a novel approach combining inductive logic programming with reinforcement learning to improve training performance and explainability. We exploit inductive learning of answer set programs from noisy examples to learn, at each batch of experience, a set of logical rules representing an explainable approximation of the agent policy. We then perform answer set reasoning on the learned rules to guide the exploration of the learning agent in the next batch. This avoids inefficient reward shaping and preserves optimality, because the rules act only as a soft bias. The entire procedure is conducted online, during the execution of the reinforcement learning algorithm. As a preliminary validation, we integrate our approach into the Q-learning algorithm for the Pac-Man scenario on two maps of increasing complexity. Our methodology produces a significant boost in the discounted return achieved by the agent, even in the first batches of training. Moreover, inductive learning does not compromise the computational time required by Q-learning, and the learned rules quickly converge to an explanation of the agent policy.
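The soft-bias idea can be illustrated with a minimal Python sketch of tabular Q-learning in which the exploratory step favors rule-suggested actions while keeping every action possible. This is an assumption-laden illustration, not the paper's implementation: the function names, the `bias` value, and the way probability mass is split are all hypothetical, and the `suggested` argument merely stands in for the output of answer set reasoning over the learned rules, which is not reproduced here.

```python
import random
from collections import defaultdict


def exploration_weights(actions, suggested, bias=0.8):
    """Return sampling weights over `actions` (hypothetical soft-bias scheme).

    Rule-suggested actions share `bias` of the probability mass; the remaining
    mass is spread uniformly over all actions, so no action ever has zero
    probability (the bias is soft, not a hard constraint).
    """
    hits = [a for a in actions if a in suggested]
    if not hits:
        # No applicable suggestion: fall back to uniform exploration.
        return [1.0 / len(actions)] * len(actions)
    uniform = (1.0 - bias) / len(actions)
    boost = bias / len(hits)
    return [uniform + (boost if a in hits else 0.0) for a in actions]


def choose_action(q, state, actions, suggested, epsilon, rng, bias=0.8):
    """Epsilon-greedy choice whose exploratory branch is softly biased."""
    if rng.random() < epsilon:
        weights = exploration_weights(actions, suggested, bias)
        return rng.choices(actions, weights=weights, k=1)[0]
    # Greedy branch: unchanged from standard Q-learning.
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; the rules never alter the reward."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


if __name__ == "__main__":
    rng = random.Random(0)
    q = defaultdict(float)
    actions = ["up", "down", "left", "right"]
    # Assumed stand-in for what reasoning on the learned rules would suggest.
    suggested = {"left"}
    a = choose_action(q, "s0", actions, suggested, epsilon=1.0, rng=rng)
    q_update(q, "s0", a, reward=1.0, next_state="s1", actions=actions)
    print(a, q[("s0", a)])
```

Because the suggestions only reshape the exploration distribution and leave both the reward and the update rule untouched, the sketch needs no reward shaping, and every state-action pair remains reachable during exploration.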