CVNov 12, 2021

The self-supervised spectral-spatial attention-based transformer network for automated, accurate prediction of crop nitrogen status from UAV imagery

arXiv:2111.06839v2
Originality Incremental advance
AI Analysis

This addresses the need for non-destructive, high-resolution nitrogen status estimation in agriculture to improve economic and environmental sustainability, though it is incremental as it builds on existing deep learning and remote sensing methods.

The paper tackled the problem of accurately predicting crop nitrogen status from UAV imagery to reduce over-application of fertilizer, achieving high accuracy (0.96) with good generalizability and reproducibility.

Nitrogen (N) fertilizer is routinely applied by farmers to increase crop yields. At present, farmers often over-apply N fertilizer in some locations or at certain times because they do not have high-resolution crop N status data. N-use efficiency can be low, with the remaining N lost to the environment, resulting in higher production costs and environmental pollution. Accurate and timely estimation of N status in crops is crucial to improving cropping systems' economic and environmental sustainability. Destructive approaches based on plant tissue analysis are time consuming and impractical over large fields. Recent advances in remote sensing and deep learning have shown promise in addressing the aforementioned challenges in a non-destructive way. In this work, we propose a novel deep learning framework: a self-supervised spectral-spatial attention-based vision transformer (SSVT). The proposed SSVT introduces a Spectral Attention Block (SAB) and a Spatial Interaction Block (SIB), which allows for simultaneous learning of both spatial and spectral features from UAV digital aerial imagery, for accurate N status prediction in wheat fields. Moreover, the proposed framework introduces local-to-global self-supervised learning to help train the model from unlabelled data. The proposed SSVT has been compared with five state-of-the-art models including: ResNet, RegNet, EfficientNet, EfficientNetV2 and the original vision transformer on both testing and independent datasets. The proposed approach achieved high accuracy (0.96) with good generalizability and reproducibility for wheat N status estimation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes