Learning Human Search Behavior from Egocentric Visual Inputs
This work addresses the challenge of creating intelligent virtual agents with humanlike search behaviors for applications in robotics and virtual environments. It is presented as a first, incremental step rather than a complete solution.
The paper tackles the problem of enabling a virtual human character to search for randomly located target objects in 3D indoor scenes using only egocentric RGBD vision, without privileged 3D information. The result is natural navigation and effective finding of household items that are often occluded. The same search policy applies to different full-body characters without retraining.
"Looking for things" is a mundane but critical task we repeatedly carry on in our daily life. We introduce a method to develop a human character capable of searching for a randomly located target object in a detailed 3D scene using its locomotion capability and egocentric vision perception represented as RGBD images. By depriving the privileged 3D information from the human character, it is forced to move and look around simultaneously to account for the restricted sensing capability, resulting in natural navigation and search behaviors. Our method consists of two components: 1) a search control policy based on an abstract character model, and 2) an online replanning control module for synthesizing detailed kinematic motion based on the trajectories planned by the search policy. We demonstrate that the combined techniques enable the character to effectively find often occluded household items in indoor environments. The same search policy can be applied to different full-body characters without the need for retraining. We evaluate our method quantitatively by testing it on randomly generated scenarios. Our work is a first step toward creating intelligent virtual agents with humanlike behaviors driven by onboard sensors, paving the road toward future robotic applications.