Katakomba: Tools and Benchmarks for Data-Driven NetHack
This work addresses the resource, implementation, and benchmark challenges facing researchers who use NetHack for offline reinforcement learning. Its contribution is incremental, since it builds on existing datasets and tools.
The authors address the low adoption of a large-scale NetHack dataset in offline reinforcement learning by identifying three obstacles to adoption and developing an open-source library with D4RL-style tasks, baseline implementations, and evaluation tools.
NetHack is known as a frontier of reinforcement learning research, where learning-based methods still need to catch up to rule-based solutions. One promising direction for a breakthrough is learning from pre-collected datasets, as in recent developments in robotics, recommender systems, and other fields under the umbrella of offline reinforcement learning (ORL). A large-scale NetHack dataset was recently released; while it was a necessary step forward, it has yet to gain wide adoption in the ORL community. In this work, we argue that there are three major obstacles to adoption: resource requirements, implementation complexity, and the lack of benchmarks. To address them, we develop an open-source library that provides workflow fundamentals familiar to the ORL community: pre-defined D4RL-style tasks, uncluttered baseline implementations, and reliable evaluation tools with accompanying configs and logs synced to the cloud.