Pokemon Red via Reinforcement Learning
This work addresses the problem of developing agents for complex, multi-task environments such as classic video games. The contribution is incremental, as it builds on existing DRL methods.
The authors tackled the challenge of training a reinforcement learning agent to play Pokémon Red, a complex game with long horizons and hard exploration, and demonstrated a baseline agent that completes an initial segment up to Cerulean City.
Pokémon Red, a classic Game Boy JRPG, presents significant challenges as a testbed for agents, including multi-tasking, long horizons of tens of thousands of steps, hard exploration, and a vast array of potential policies. We introduce a simplistic environment and a Deep Reinforcement Learning (DRL) training methodology, demonstrating a baseline agent that completes an initial segment of the game, through the completion of Cerulean City. Our experiments include various ablations that reveal vulnerabilities in reward shaping, where agents exploit specific reward signals. We also discuss limitations and argue that games like Pokémon hold strong potential for future research on Large Language Model agents, hierarchical training algorithms, and advanced exploration methods. Source Code: https://github.com/MarcoMeter/neroRL/tree/poke_red