Modeling Human Eye Movements with Neural Networks in a Maze-Solving Task
This work addresses the problem of understanding the computational objectives that guide human eye movements. It offers researchers in cognitive science and AI a generative model of eye movements and a computational theory for a specific task, maze-solving.
The researchers modeled human eye movements in a maze-solving task by collecting eye movement data and building deep generative models. Eye movements were best predicted by a model optimized to run an internal simulation of an object traversing the maze, which suggests that humans solve the task by mental simulation.
From smoothly pursuing moving objects to rapidly shifting gaze during visual search, humans employ a wide variety of eye movement strategies in different contexts. While eye movements provide a rich window into mental processes, building generative models of eye movements is notoriously difficult, and to date the computational objectives guiding eye movements remain largely a mystery. In this work, we tackled these problems in the context of a canonical spatial planning task, maze-solving. We collected eye movement data from human subjects and built deep generative models of eye movements using a novel differentiable architecture for gaze fixations and gaze shifts. We found that human eye movements are best predicted by a model that is optimized not to perform the task as efficiently as possible but instead to run an internal simulation of an object traversing the maze. This not only provides a generative model of eye movements in this task but also suggests a computational theory for how humans solve the task, namely that humans use mental simulation.