Frog Soup: Zero-Shot, In-Context, and Sample-Efficient Frogger Agents
This work addresses the challenge of developing general-purpose, adaptable RL agents for games, though the contribution is incremental because it evaluates on a single game.
The researchers tackle the slow and costly training of reinforcement learning agents for Atari games. They demonstrate that reasoning LLMs with out-of-domain RL post-training can play Frogger zero-shot, and they show that bootstrapping traditional RL with LLM demonstrations improves performance and sample efficiency.
One of the primary aspirations in reinforcement learning research is developing general-purpose agents capable of rapidly adapting to and mastering novel tasks. While RL gaming agents have mastered many Atari games, they remain slow and costly to train for each game. In this work, we demonstrate that the latest reasoning LLMs with out-of-domain RL post-training can play Frogger, a challenging Atari game, in a zero-shot setting. We then investigate how in-context learning and the amount of reasoning effort affect LLM performance. Lastly, we demonstrate a way to bootstrap traditional RL methods with LLM demonstrations, which significantly improves their performance and sample efficiency. Our implementation is open sourced at https://github.com/AlienKevin/frogger.
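
The abstract does not say how the LLM demonstrations are fed into the RL method. One common approach, assumed here purely for illustration, is to seed an off-policy agent's replay buffer with demonstration transitions before the agent collects any experience of its own. The sketch below shows that idea only; `Transition`, `ReplayBuffer`, and `seed_with_demonstrations` are hypothetical names and are not taken from the linked repository.

```python
import random
from collections import deque
from typing import Deque, Iterable, List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One environment step: (state, action, reward, next_state, done)."""
    state: tuple
    action: int
    reward: float
    next_state: tuple
    done: bool


class ReplayBuffer:
    """Fixed-capacity FIFO buffer; the oldest transitions are evicted first."""

    def __init__(self, capacity: int) -> None:
        self.storage: Deque[Transition] = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self.storage.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Sample without replacement, capped at the current buffer size.
        k = min(batch_size, len(self.storage))
        return rng.sample(list(self.storage), k)

    def __len__(self) -> int:
        return len(self.storage)


def seed_with_demonstrations(
    buffer: ReplayBuffer,
    episodes: Iterable[Sequence[Transition]],
) -> int:
    """Copy demonstration episodes (assumed here to come from an LLM player)
    into the buffer and return the number of transitions added."""
    added = 0
    for episode in episodes:
        for transition in episode:
            buffer.add(transition)
            added += 1
    return added
```

Under this assumption, training would call `seed_with_demonstrations` once before the usual collect-and-update loop, so the first gradient updates already see rewarded trajectories instead of only random exploration.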