Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines
This paper addresses the challenge of enabling RL agents to use commonsense knowledge for more efficient decision-making in text-based games. It is an incremental advance that contributes new benchmarks.
The paper tackles the problem of infusing RL agents with commonsense knowledge for text-based games. In the new TextWorld Commonsense (TWC) environment, agents that incorporate such knowledge perform better and act more efficiently, while user studies of human performance indicate ample room for improvement.
Text-based games have emerged as an important testbed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to act efficiently in the world by pruning out implausible actions, and to perform look-ahead planning to determine how current actions might affect future world states. We design a new text-based gaming environment called TextWorld Commonsense (TWC) for training and evaluating RL agents with a specific kind of commonsense knowledge about objects, their attributes, and affordances. We also introduce several baseline RL agents that track the sequential context and dynamically retrieve the relevant commonsense knowledge from ConceptNet. We show that agents that incorporate commonsense knowledge in TWC perform better while acting more efficiently. We conduct user studies to estimate human performance on TWC and show that there is ample room for future improvement.
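To make the idea of pruning implausible actions concrete, the following is a minimal sketch and not the paper's implementation. It assumes a tiny hand-written table of (object, relation) entries standing in for ConceptNet, and a hypothetical helper `prune_actions` that filters candidate placements of an object into a container. The table contents, the helper name, and the choice to leave unknown objects unpruned are all illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

# Toy stand-in for ConceptNet: (object, relation) -> plausible targets.
# These entries are illustrative assumptions, not actual ConceptNet data.
TOY_KNOWLEDGE: Dict[Tuple[str, str], Set[str]] = {
    ("apple", "AtLocation"): {"fridge", "fruit bowl"},
    ("dirty shirt", "AtLocation"): {"washing machine", "laundry basket"},
}

Action = Tuple[str, str]  # (object, container)


def prune_actions(candidates: List[Action]) -> List[Action]:
    """Keep only placements that the toy knowledge table supports.

    Objects missing from the table are left unpruned, because the agent
    has no evidence against any placement for them.
    """
    kept: List[Action] = []
    for obj, container in candidates:
        plausible = TOY_KNOWLEDGE.get((obj, "AtLocation"))
        if plausible is None or container in plausible:
            kept.append((obj, container))
    return kept


candidates = [
    ("apple", "fridge"),
    ("apple", "washing machine"),
    ("dirty shirt", "washing machine"),
    ("dirty shirt", "fridge"),
    ("book", "fridge"),
]
pruned = prune_actions(candidates)
print(pruned)
```

In this sketch the two implausible placements are dropped before the agent ever scores them, which is one simple way a smaller action set could translate into fewer wasted steps. A learned agent would more likely use the retrieved knowledge as a soft signal rather than a hard filter.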