Joel Casimiro

14.4CVMay 22, 2021Code

GOO: A Dataset for Gaze Object Prediction in Retail Environments

Henri Tomas, Marcus Reyes, Raimarc Dionido et al.

One of the most fundamental and information-laden actions humans do is to look at objects. However, a survey of current works reveals that existing gaze-related datasets annotate only the pixel being looked at, and not the boundaries of a specific object of interest. This lack of object annotation presents an opportunity for further advancing gaze estimation research. To this end, we present a challenging new task called gaze object prediction, where the goal is to predict a bounding box for a person's gazed-at object. To train and evaluate gaze networks on this task, we present the Gaze On Objects (GOO) dataset. GOO is composed of a large set of synthetic images (GOO Synth) supplemented by a smaller subset of real images (GOO-Real) of people looking at objects in a retail environment. Our work establishes extensive baselines on GOO by re-implementing and evaluating selected state-of-the art models on the task of gaze following and domain adaptation. Code is available on github.

14.7CVAug 28, 2020Code

Next-Best View Policy for 3D Reconstruction

Daryl Peralta, Joel Casimiro, Aldrin Michael Nilles et al.

Manually selecting viewpoints or using commonly available flight planners like circular path for large-scale 3D reconstruction using drones often results in incomplete 3D models. Recent works have relied on hand-engineered heuristics such as information gain to select the Next-Best Views. In this work, we present a learning-based algorithm called Scan-RL to learn a Next-Best View (NBV) Policy. To train and evaluate the agent, we created Houses3K, a dataset of 3D house models. Our experiments show that using Scan-RL, the agent can scan houses with fewer number of steps and a shorter distance compared to our baseline circular path. Experimental results also demonstrate that a single NBV policy can be used to scan multiple houses including those that were not seen during training. The link to Scan-RL is available at https://github.com/darylperalta/ScanRL and Houses3K dataset can be found at https://github.com/darylperalta/Houses3K.

Joel Casimiro

2 Papers