CV · Feb 27, 2017

A Dataset for Developing and Benchmarking Active Vision

Phil Ammirato, Patrick Poirson, Eunbyung Park, Jana Kosecka, Alexander C. Berg

arXiv:1702.08272v2 · 23.0 · 208 citations

Originality: Synthesis-oriented

AI Analysis

This paper provides a dataset for developing and benchmarking active vision systems in robotics. The contribution is incremental, building on existing object detection and active vision research.

The authors address the problem of benchmarking active vision for robotics by introducing a new public dataset with over 20,000 RGB-D images and 50,000 bounding boxes in indoor scenes. They show that state-of-the-art object detection is still severely impacted by factors such as object scale and occlusion, and they use the dataset to develop a reinforcement learning system for next best move prediction.

We present a new public dataset with a focus on simulating robotic vision tasks in everyday indoor environments using real imagery. The dataset includes 20,000+ RGB-D images and 50,000+ 2D bounding boxes of object instances densely captured in 9 unique scenes. We train a fast object category detector for instance detection on our data. Using the dataset, we show that, although increasingly accurate and fast, the state of the art for object detection is still severely impacted by object scale, occlusion, and viewing direction, all of which matter for robotics applications. We next validate the dataset for simulating active vision, and use the dataset to develop and evaluate a deep-network-based system for next best move prediction for object classification using reinforcement learning. Our dataset is available for download at cs.unc.edu/~ammirato/active_vision_dataset_website/.
