A Novel Multi-scale Attention Feature Extraction Block for Aerial Remote Sensing Image Classification
This work addresses performance instability in aerial remote sensing image classification and represents an incremental improvement over existing methods.
The paper tackles the limited capability of existing methods to represent complex and small objects in very high-resolution aerial remote sensing image classification. It proposes a novel multi-scale attention feature extraction block that achieves stable performance, with a minimum standard deviation of 0.002, and overall classification accuracies of 95.85% on the AID dataset and 94.09% on the NWPU dataset.
Classification of very high-resolution (VHR) aerial remote sensing (RS) images is a well-established research area in the remote sensing community, as it provides valuable spatial information for decision-making. Existing works on VHR aerial RS image classification achieve excellent classification performance; nevertheless, they have limited capability to represent VHR RS images containing complex and small objects, which leads to performance instability. We therefore propose a novel plug-and-play multi-scale attention feature extraction block (MSAFEB), based on multi-scale convolution at two levels with a skip connection, which produces discriminative, salient information at a deeper, finer level. An experimental study on two benchmark VHR aerial RS image datasets (AID and NWPU) demonstrates that our proposal achieves stable, consistent performance (minimum standard deviation of $0.002$) and competitive overall classification accuracy (AID: 95.85\%; NWPU: 94.09\%).
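
As a rough illustration of multi-scale convolution at two levels with a skip connection, the following is a minimal pure-Python sketch on a single-channel grid. It is not the MSAFEB implementation. The kernel sizes (1, 3, 5), the fixed averaging kernels used in place of learned weights, the averaging fusion of scales, the sigmoid gate standing in for attention, and the names `conv2d_same`, `box_kernel`, `multi_scale`, and `msafeb_sketch` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def conv2d_same(x, k):
    """Zero-padded 'same' 2D cross-correlation of grid x with an odd square kernel k."""
    h, w, r = len(x), len(x[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    # Positions outside the grid count as zero padding.
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += x[ii][jj] * k[di + r][dj + r]
            out[i][j] = acc
    return out


def box_kernel(size):
    """Averaging kernel standing in for learned weights (assumption)."""
    v = 1.0 / (size * size)
    return [[v] * size for _ in range(size)]


def multi_scale(x, sizes=(1, 3, 5)):
    """Average the responses of parallel convolutions with different kernel sizes."""
    maps = [conv2d_same(x, box_kernel(s)) for s in sizes]
    h, w = len(x), len(x[0])
    return [[sum(m[i][j] for m in maps) / len(maps) for j in range(w)]
            for i in range(h)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def msafeb_sketch(x):
    """Two multi-scale levels, a hypothetical sigmoid gate, and a skip connection."""
    h, w = len(x), len(x[0])
    level1 = multi_scale(x)        # first multi-scale level
    level2 = multi_scale(level1)   # second, deeper multi-scale level
    # Hypothetical attention: each level-2 response is reweighted by its own sigmoid.
    gated = [[level2[i][j] * sigmoid(level2[i][j]) for j in range(w)]
             for i in range(h)]
    # Skip connection: add the block input back to the gated features.
    return [[x[i][j] + gated[i][j] for j in range(w)] for i in range(h)]


if __name__ == "__main__":
    patch = [[0.0, 1.0, 0.0],
             [1.0, 2.0, 1.0],
             [0.0, 1.0, 0.0]]
    for row in msafeb_sketch(patch):
        print([round(v, 3) for v in row])
```

The output grid has the same shape as the input, which is the property that lets a block of this kind be inserted between existing layers in a plug-and-play fashion. A practical version would use learned multi-channel convolutions in a deep learning framework rather than fixed averaging kernels.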