CV | AI | LG | Sep 25, 2022

High-Resolution Satellite Imagery for Modeling the Impact of Aridification on Crop Production

arXiv:2209.12238v1 | 1 citation | h-index: 12
Originality: Synthesis-oriented
AI Analysis

This paper addresses the limited availability of data for training ML models in agriculture, particularly for modeling the impact of aridification. The contribution is incremental, since it centers on dataset creation and benchmarking.

The authors tackle the scarcity of curated, labeled datasets for remote sensing in agriculture by introducing SICKLE, a first-of-its-kind dataset with 2,398 season-wise samples annotated with key cropping parameters for paddy cultivation. They also propose a yield prediction strategy that improves performance by leveraging domain knowledge.

The availability of well-curated datasets has driven the success of Machine Learning (ML) models. Despite increased access to earth observation data for agriculture, curated, labelled datasets remain scarce, which limits the data's potential for training ML models for remote sensing (RS) in agriculture. To this end, we introduce a first-of-its-kind dataset, SICKLE, containing time-series images at different spatial resolutions from 3 different satellites, annotated with multiple key cropping parameters for paddy cultivation in the Cauvery Delta region of Tamil Nadu, India. The dataset comprises 2,398 season-wise samples from 388 unique plots distributed across 4 districts of the Delta. It covers multi-spectral, thermal and microwave data from January 2018 to March 2021. The paddy samples are annotated with 4 key cropping parameters: sowing date, transplanting date, harvesting date and crop yield. This is one of the first studies to consider the growing season (using sowing and harvesting dates) as part of a dataset. We also propose a yield prediction strategy that uses time-series data generated from the observed growing season and the standard seasonal information obtained from Tamil Nadu Agricultural University for the region. The resulting performance improvement highlights the value of ML techniques that leverage domain knowledge consistent with the standard practices followed by farmers in a specific region. We benchmark the dataset on 3 separate tasks, namely crop type, phenology date (sowing, transplanting, harvesting) and yield prediction, and develop an end-to-end framework for predicting key crop parameters in a real-world setting.
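
The idea of building a yield-prediction time series from the growing season can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `clip_to_season`, the sample values, and all dates below are hypothetical and chosen only for illustration. The sketch assumes each plot has a list of (acquisition date, value) observations and keeps only those that fall inside a season window, taken either from observed sowing and harvesting annotations or from a standard regional calendar.

```python
from datetime import date


def clip_to_season(observations, start, end):
    """Keep only (acquisition_date, value) pairs acquired within [start, end]."""
    return [(d, v) for d, v in observations if start <= d <= end]


# Hypothetical per-plot time series: (acquisition date, vegetation-index value).
observations = [
    (date(2019, 9, 1), 0.18),
    (date(2019, 10, 15), 0.35),
    (date(2019, 12, 1), 0.72),
    (date(2020, 1, 20), 0.55),
    (date(2020, 3, 5), 0.21),
]

# Window from hypothetical observed sowing and harvesting dates for this plot.
observed = clip_to_season(observations, date(2019, 10, 1), date(2020, 2, 1))

# Window from a hypothetical standard regional season calendar.
standard = clip_to_season(observations, date(2019, 9, 1), date(2020, 1, 31))

print(len(observed), len(standard))
```

Either clipped series could then be passed to a downstream yield model. Restricting the input to the season window is one plausible way to remove observations that carry no information about the crop being predicted.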

