EDEN: Multimodal Synthetic Dataset of Enclosed GarDEN Scenes
EDEN fills a gap for nature-oriented applications such as agriculture and gardening, though the contribution is incremental: it extends existing multimodal dataset approaches to a new domain.
The authors address the lack of multimodal datasets for unstructured natural scenes by creating EDEN, a synthetic dataset of more than 300K images captured from more than 100 garden models, with each image annotated with multiple vision modalities. In their experiments, pre-training deep networks on EDEN improves semantic segmentation and monocular depth prediction on such scenes.
Multimodal large-scale datasets for outdoor scenes are mostly designed for urban driving problems. These scenes are highly structured and semantically different from those found in nature-centered settings such as gardens or parks. To promote machine learning methods for nature-oriented applications, such as agriculture and gardening, we propose the multimodal synthetic dataset for Enclosed garDEN scenes (EDEN). The dataset features more than 300K images captured from more than 100 garden models. Each image is annotated with various low- and high-level vision modalities, including semantic segmentation, depth, surface normals, intrinsic colors, and optical flow. Experimental results on state-of-the-art methods for semantic segmentation and monocular depth prediction, two important tasks in computer vision, show a positive impact of pre-training deep networks on our dataset for unstructured natural scenes. The dataset and related materials will be available at https://lhoangan.github.io/eden.
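To make the per-image annotation concrete, the following is a minimal sketch of how one might check that a sample carries every modality listed in the abstract at a consistent resolution. It does not reflect EDEN's actual file format: the modality keys, the channel counts (chosen from common conventions, e.g. two channels for optical flow), and the names `ModalityShape` and `check_sample` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical modality names and channel counts. These follow common
# conventions and are NOT taken from EDEN's documentation.
MODALITY_CHANNELS: Dict[str, int] = {
    "rgb": 3,        # color image
    "semantic": 1,   # per-pixel class id
    "depth": 1,      # per-pixel depth
    "normals": 3,    # surface normal vector (x, y, z)
    "intrinsic": 3,  # intrinsic colors, assumed stored as a 3-channel map
    "flow": 2,       # optical flow displacement (u, v)
}


@dataclass(frozen=True)
class ModalityShape:
    """Spatial size and channel count of one annotation map."""
    height: int
    width: int
    channels: int


def check_sample(shapes: Dict[str, ModalityShape]) -> List[str]:
    """Return a list of problems found in a sample; empty if it is consistent."""
    problems: List[str] = []
    for name, channels in MODALITY_CHANNELS.items():
        if name not in shapes:
            problems.append(f"missing modality: {name}")
        elif shapes[name].channels != channels:
            problems.append(
                f"{name}: expected {channels} channels, got {shapes[name].channels}"
            )
    # All annotation maps should be pixel-aligned with each other.
    sizes = {(s.height, s.width) for s in shapes.values()}
    if len(sizes) > 1:
        problems.append("modalities disagree on spatial size")
    return problems
```

A loader built on this idea would call `check_sample` once per frame before batching, so that a missing or misaligned annotation map is reported early rather than surfacing as a shape error during training.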