Estimating Central, Peripheral, and Temporal Visual Contributions to Human Decision Making in Atari Games
This work offers insights into human visual processing for researchers in cognitive science and AI, though the contribution is incremental, as it applies an existing ablation framework to a new dataset.
The study quantifies how different visual information sources contribute to human decision making in dynamic environments such as Atari games. It finds that peripheral information has the strongest impact, with median prediction-accuracy drops of 35.27-43.90% when removed, while gaze information (2.11-2.76%) and past-state information (1.52-15.51%) show smaller effects.
We study how different visual information sources contribute to human decision making in dynamic visual environments. Using Atari-HEAD, a large-scale Atari gameplay dataset with synchronized eye-tracking, we introduce a controlled ablation framework to reverse-engineer, from human behavior, the contributions of peripheral visual information, explicit gaze information in the form of gaze maps, and past-state information. We train action-prediction networks under six settings that selectively include or exclude these information sources. Across 20 games, peripheral information shows by far the strongest contribution, with median prediction-accuracy drops in the range of 35.27-43.90% when removed. Gaze information yields smaller drops of 2.11-2.76%, while past-state information shows a broader range of 1.52-15.51%, with the upper end likely more informative due to reduced peripheral-information leakage. To complement aggregate accuracies, we cluster states by the true-action probabilities assigned by the different model configurations. This analysis identifies coarse behavioral regimes, including focus-dominated, periphery-dominated, and more contextual decision situations. These results suggest that human decision making in Atari depends strongly on information beyond the current focus of gaze, while the proposed framework provides a way to estimate such information-source contributions from behavior.
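As a rough illustration of how a median accuracy drop across games could be computed from two model configurations, the sketch below compares per-game accuracies with and without one information source. It is not the authors' code. The function name `median_accuracy_drop`, the game labels, and all numbers are hypothetical placeholders. The drop is also assumed to be an absolute difference in percentage points, which the abstract does not specify.

```python
from statistics import median


def median_accuracy_drop(acc_with, acc_without):
    """Median over games of (accuracy with a source) - (accuracy without it).

    Both arguments map a game name to prediction accuracy in percent.
    Assumption: the drop is an absolute difference in percentage points.
    """
    games = sorted(set(acc_with) & set(acc_without))
    if not games:
        raise ValueError("the two configurations share no games")
    return median(acc_with[g] - acc_without[g] for g in games)


# Placeholder accuracies for illustration only; these are NOT results
# from the paper or from Atari-HEAD.
full_model = {"game_a": 80.0, "game_b": 70.0, "game_c": 90.0}
no_periphery = {"game_a": 40.0, "game_b": 35.0, "game_c": 45.0}

drop = median_accuracy_drop(full_model, no_periphery)
print(f"median drop: {drop:.2f} percentage points")
```

Repeating the comparison for each pair of configurations that differ in a single information source would give one median drop per pair, which is one plausible way a range of drops for a single source could arise.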