DIML/CVL RGB-D Dataset: 2M RGB-D Images of Natural Indoor and Outdoor Scenes
This work provides a new dataset for computer vision researchers working on depth estimation and 3D scene understanding, though the contribution is incremental in that it expands existing RGB-D data collections.
The authors introduced the DIML/CVL RGB-D dataset, which contains 2 million RGB-D images from diverse natural indoor and outdoor scenes, addressing the need for large-scale depth data. The dataset was collected using Microsoft Kinect v2 for indoor scenes and stereo cameras for outdoor scenes.
This manual provides a detailed description of the DIML/CVL RGB-D dataset. The dataset comprises 2M color images and their corresponding depth maps from a wide variety of natural indoor and outdoor scenes. The indoor dataset was constructed using the Microsoft Kinect v2, while the outdoor dataset was built using stereo cameras (a ZED stereo camera and a built-in stereo camera). Table I summarizes the details of our dataset, including acquisition, processing, format, and toolbox. Refer to Sections II and III for more details.