IV · CV · LG · Jan 31, 2022

AutoGeoLabel: Automated Label Generation for Geospatial Machine Learning

arXiv:2202.00067v1 · 12 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses data scarcity for researchers and practitioners in geospatial machine learning, though it appears incremental because it builds on existing rasterization and statistical methods.

The paper tackles the challenge of limited human-labeled data in supervised learning by proposing an automated label generation pipeline for remote sensing data, achieving class accuracies of approximately 0.9.

A key challenge of supervised learning is the availability of human-labeled data. We evaluate a big data processing pipeline to auto-generate labels for remote sensing data. It is based on rasterized statistical features extracted from surveys such as LiDAR measurements. Using simple combinations of the rasterized statistical layers, it is demonstrated that multiple classes can be generated at accuracies of ~0.9. As proof of concept, we utilize the big geo-data platform IBM PAIRS to dynamically generate such labels in dense urban areas with multiple land cover classes. The general method proposed here is platform independent, and it can be adapted to generate labels for other satellite modalities in order to enable machine learning on overhead imagery for land use classification and object detection.
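
To make the idea of "simple combinations of the rasterized statistical layers" concrete, here is a minimal sketch in Python with NumPy. It is not the paper's pipeline and does not use IBM PAIRS. The layer names (`height_above_ground`, `returns_mean`, `reflectance_mean`), the thresholds, and the class codes are all hypothetical, chosen only to illustrate how per-pixel rules over co-registered rasters could yield a label map. The per-class accuracy shown is one possible definition (the fraction of reference pixels of a class that receive that class), which is also an assumption.

```python
import numpy as np


def auto_label(height_above_ground, returns_mean, reflectance_mean,
               height_thresh=2.0, returns_thresh=1.5, refl_thresh=0.5):
    """Combine co-registered raster layers into a label map.

    All layer names, thresholds, and class codes are hypothetical:
      0 = unlabeled, 1 = vegetation, 2 = building, 3 = road
    """
    labels = np.zeros(height_above_ground.shape, dtype=np.uint8)

    tall = height_above_ground > height_thresh   # elevated surface
    multi = returns_mean > returns_thresh        # several returns per pulse
    dark = reflectance_mean < refl_thresh        # low-reflectance surface

    labels[tall & multi] = 1     # tall and porous: vegetation (assumed rule)
    labels[tall & ~multi] = 2    # tall and solid: building (assumed rule)
    labels[~tall & dark] = 3     # low and dark: road (assumed rule)
    return labels


def per_class_accuracy(pred, truth, classes):
    """Fraction of reference pixels of each class that were labeled as that class."""
    return {
        c: float(np.mean(pred[truth == c] == c))
        for c in classes
        if np.any(truth == c)
    }
```

In such a sketch, the generated label map would be compared against a small hand-labeled reference raster through `per_class_accuracy`, and the thresholds tuned until each class reaches the desired accuracy.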

