Feudal Reinforcement Learning by Reading Manuals
This work addresses the problem of enabling AI agents to reason effectively from concise instructions in complex environments, though it is incremental, building on existing feudal and multi-hop reasoning approaches.
The paper tackles the semantic mismatch between high-level language instructions and low-level actions in reading-to-act tasks by introducing a Feudal Reinforcement Learning model with a manager-worker architecture, achieving competitive performance on the RTFM and Messenger tasks without a human-designed curriculum.
Reading to act is a prevalent but challenging task that requires the ability to reason from a concise instruction. However, previous works face a semantic mismatch between low-level actions and high-level language descriptions, and they require a human-designed curriculum to work properly. In this paper, we present a Feudal Reinforcement Learning (FRL) model consisting of a manager agent and a worker agent. The manager agent is a multi-hop plan generator that deals with high-level abstract information and generates a series of sub-goals in a backward manner. The worker agent deals with low-level perceptions and actions to achieve the sub-goals one by one. In comparison, our FRL model effectively alleviates the mismatch between text-level inference and low-level perceptions and actions, and it generalizes to various forms of environments, instructions, and manuals. Our multi-hop plan generator can significantly boost performance on challenging tasks where multi-step reasoning from the texts is critical to resolving the instructed goals. We show that our approach achieves competitive performance on two challenging tasks, Read to Fight Monsters (RTFM) and Messenger, without human-designed curriculum learning.
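The manager-worker split described in the abstract can be illustrated with a toy sketch. This is not the paper's model: the manager here is a plain dictionary lookup rather than a learned multi-hop plan generator, the worker is a stand-in that only logs sub-goals, and the names `ToyManual`, `plan_backward`, and `run_worker`, as well as the example manual entries, are hypothetical. It only shows the control flow of chaining backward from the final goal and then handing sub-goals to the worker in execution order.

```python
from typing import Dict, List

# Hypothetical toy manual: each entry maps a goal to the prerequisite that
# must be achieved first. This is an assumed format, not the paper's.
ToyManual = Dict[str, str]


def plan_backward(goal: str, manual: ToyManual) -> List[str]:
    """Toy manager: chain backward from the final goal, one hop per lookup,
    then return the sub-goals in the order the worker should execute them."""
    chain = [goal]
    seen = {goal}
    while chain[-1] in manual:
        prerequisite = manual[chain[-1]]
        if prerequisite in seen:  # guard against cyclic manuals
            break
        chain.append(prerequisite)
        seen.add(prerequisite)
    return list(reversed(chain))


def run_worker(sub_goals: List[str]) -> List[str]:
    """Toy worker: 'achieves' the sub-goals one by one and logs each step.
    A real worker would map perceptions to low-level actions instead."""
    return [f"achieved: {sub_goal}" for sub_goal in sub_goals]


manual = {
    "defeat the goblin": "pick up the sword",
    "pick up the sword": "walk to the sword",
}
sub_goals = plan_backward("defeat the goblin", manual)
log = run_worker(sub_goals)
```

In this sketch the manager reasons only over text-level entries and the worker only sees one sub-goal at a time, which mirrors the separation of high-level inference from low-level action that the abstract describes.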