CV Apr 28, 2021

DeepSatData: Building large scale datasets of satellite images for training machine learning models

arXiv:2104.13824v23 citations Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for scalable datasets in remote sensing and computer vision, but it is incremental as it builds on existing data sources and methods.

The authors tackled the problem of generating large-scale satellite imagery datasets for training machine learning models, focusing on dense classification tasks like semantic segmentation, by using freely available Sentinel-2 data and providing accompanying code.

This report presents design considerations for automatically generating satellite imagery datasets for training machine learning models, with emphasis placed on dense classification tasks, e.g. semantic segmentation. The implementation presented makes use of freely available Sentinel-2 data, which allows generation of the large-scale datasets required for training deep neural networks. We discuss issues faced from the point of view of deep neural network training and evaluation, such as checking the quality of ground truth data, and comment on the scalability of the approach. Accompanying code is provided at https://github.com/michaeltrs/DeepSatData.
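The abstract mentions checking the quality of ground truth data before training. As a rough illustration of what such a check could look like for dense labels, the sketch below computes the share of labeled pixels in each patch, drops sparsely labeled patches, and counts pixels per class. It is not taken from the DeepSatData repository: the function names, the `UNLABELED` id, and the `min_labeled` threshold are all hypothetical, and masks are represented as plain nested lists to keep the example dependency-free.

```python
from collections import Counter

UNLABELED = 0  # hypothetical id for pixels without ground truth


def labeled_fraction(mask):
    """Return the share of pixels in a 2-D label mask that carry a class id."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    labeled = sum(1 for row in mask for px in row if px != UNLABELED)
    return labeled / total


def class_histogram(masks):
    """Count labeled pixels per class id across a collection of masks."""
    counts = Counter()
    for mask in masks:
        for row in mask:
            counts.update(px for px in row if px != UNLABELED)
    return counts


def filter_patches(masks, min_labeled=0.5):
    """Return indices of patches whose labeled share reaches the threshold."""
    return [i for i, m in enumerate(masks) if labeled_fraction(m) >= min_labeled]


# Two toy 2x2 label patches (assumed data, for illustration only).
patches = [
    [[1, 1], [2, 0]],  # 3 of 4 pixels labeled
    [[0, 0], [0, 3]],  # 1 of 4 pixels labeled
]
kept = filter_patches(patches, min_labeled=0.5)
hist = class_histogram(patches[i] for i in kept)
```

A per-class histogram of this kind can help spot heavily imbalanced or missing classes before a segmentation model is trained; the appropriate threshold depends on the dataset and is an assumption here.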

Code Implementations: 1 repo
