SPRIG: Stackelberg Perception-Reinforcement Learning with Internal Game Dynamics
The work addresses a specific problem for reinforcement learning agents in environments where feature relevance varies, but it is incremental because it builds on existing policy optimization methods.
The paper tackles the challenge of coordinating perception and decision-making in deep reinforcement learning with high-dimensional sensory inputs. It introduces SPRIG, a framework that models the agent's internal perception-policy interaction as a cooperative Stackelberg game, and reports around 30% higher returns than standard PPO on Atari BeamRider.
Deep reinforcement learning agents often struggle to coordinate their perception and decision-making components effectively, particularly in environments with high-dimensional sensory inputs where feature relevance varies. This work introduces SPRIG (Stackelberg Perception-Reinforcement learning with Internal Game dynamics), a framework that models the internal perception-policy interaction within a single agent as a cooperative Stackelberg game. In SPRIG, the perception module acts as the leader, strategically processing raw sensory states, while the policy module acts as the follower, making decisions based on the extracted features. SPRIG provides theoretical guarantees through a modified Bellman operator while preserving the benefits of modern policy optimization. Experimental results on the Atari BeamRider environment demonstrate SPRIG's effectiveness: it achieves around 30% higher returns than standard PPO through its game-theoretic balance of feature extraction and decision-making.
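The leader-follower structure described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation and does not use PPO or a Bellman operator. It assumes a hypothetical two-bit state in which only the first bit affects the reward, and it solves the resulting cooperative Stackelberg game by brute force. The perception leader picks which bit to expose as the feature, the policy follower best-responds to that choice, and both share one payoff. All names (`perceive`, `follower_best_response`, `stackelberg_solve`) are invented for illustration.

```python
from itertools import product

# Hypothetical toy world: a state is (relevant_bit, distractor_bit),
# drawn uniformly. Only the relevant bit matters for the reward.
STATES = [(r, d) for r in (0, 1) for d in (0, 1)]
ACTIONS = (0, 1)

# Leader (perception) strategies: which state coordinate to expose.
LEADER_CHOICES = (0, 1)


def reward(state, action):
    """Shared (cooperative) payoff: 1 when the action matches the relevant bit."""
    return 1.0 if action == state[0] else 0.0


def perceive(state, choice):
    """Perception module: extract a single feature from the raw state."""
    return state[choice]


def follower_best_response(choice):
    """Policy module: best feature->action mapping for a fixed leader choice.

    Returns (policy, expected_reward) under the uniform state distribution.
    """
    best_policy, best_value = None, float("-inf")
    for a0, a1 in product(ACTIONS, repeat=2):
        policy = {0: a0, 1: a1}  # action taken for feature value 0 and 1
        value = sum(
            reward(s, policy[perceive(s, choice)]) for s in STATES
        ) / len(STATES)
        if value > best_value:
            best_policy, best_value = policy, value
    return best_policy, best_value


def stackelberg_solve():
    """Leader commits first, anticipating the follower's best response."""
    best = None
    for choice in LEADER_CHOICES:
        policy, value = follower_best_response(choice)
        if best is None or value > best[2]:
            best = (choice, policy, value)
    return best


if __name__ == "__main__":
    choice, policy, value = stackelberg_solve()
    print("leader exposes coordinate:", choice)
    print("follower policy:", policy)
    print("shared expected reward:", value)
```

In this toy setting the leader's choice matters because it changes what the follower can achieve. Exposing the distractor bit caps the follower at chance-level reward, so a leader that anticipates the follower's response selects the relevant bit. A learned version would replace the brute-force search with gradient-based updates to both modules.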