Modulation Feature Enhancement with a Multi-Stage Attention Network for Underwater Acoustic Target Recognition
For researchers in underwater acoustic target recognition, this work addresses challenges of complex noise and class imbalance with a novel attention-based framework.
The paper proposes a deep learning framework for underwater acoustic target recognition with three components: feature extraction and fusion based on variational mode decomposition (VMD) and the 3/2-D spectrum, a 1-D CNN with a Multi-Stage Multi-Type Attention Mechanism (MMATT) built on two new spectral attention modules (R-CISAM and MS-SFSAM), and an Adjustable Class-Balanced Focal Loss (ACBFL) to handle class imbalance. Experiments on a real-world ship-radiated noise dataset show improved recognition performance.
Underwater acoustic target recognition is critical for maritime applications, yet it faces challenges arising from the complex and diverse nature of ship-radiated noise. To address these challenges, we propose a robust deep learning-based framework. First, we introduce a feature extraction and fusion method based on variational mode decomposition (VMD) and the 3/2-D spectrum to generate high-fidelity 2-D DEMON (Detection of Envelope Modulation on Noise) spectral features, which effectively capture modulation envelope information. To further enhance feature representation, we design a one-dimensional convolutional neural network (1-D CNN) integrated with a novel Multi-Stage Multi-Type Attention Mechanism (MMATT) that adaptively refines features at different network depths. Within this mechanism, we propose a Residual Channel-Independent Spectral Attention Mechanism (R-CISAM) and a Multi-Scale Separate-and-Fuse Spectral Attention Mechanism (MS-SFSAM). Moreover, to mitigate the performance degradation caused by the severe class imbalance inherent in real-world ship-radiated noise data, we devise an Adjustable Class-Balanced Focal Loss (ACBFL), which provides flexibility across tasks with varying degrees of imbalance. Experimental results on a real-world ship-radiated noise dataset demonstrate that the proposed solutions effectively enhance underwater acoustic target recognition performance.
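The abstract does not give the exact form of ACBFL, so the following is only a minimal sketch of the general idea it builds on: a focal loss whose per-class weights come from the "effective number of samples" used in class-balanced losses. The function names, the sample counts, and the choice of `beta` and `gamma` as the adjustable knobs are assumptions made for illustration, not the authors' formulation.

```python
import math

def class_balanced_weights(counts, beta=0.999):
    """Per-class weights proportional to (1 - beta) / (1 - beta**n_c),
    rescaled so that they sum to the number of classes.
    Assumes every count is positive and 0 <= beta < 1."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cb_focal_loss(logits, label, weights, gamma=2.0):
    """Class-weighted focal loss for one sample:
    -w_y * (1 - p_y)**gamma * log(p_y)."""
    p_t = softmax(logits)[label]
    return -weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)

# Hypothetical per-class sample counts for an imbalanced three-class task.
counts = [5000, 500, 50]
weights = class_balanced_weights(counts, beta=0.999)

# Two samples predicted with the same confidence in their true class:
# one from the rarest class, one from the most common class.
loss_rare = cb_focal_loss([0.2, 0.1, 1.5], label=2, weights=weights)
loss_common = cb_focal_loss([1.5, 0.1, 0.2], label=0, weights=weights)
```

In this sketch, raising `beta` toward 1 strengthens the re-weighting in favor of rare classes, and raising `gamma` down-weights well-classified samples more aggressively. Setting `beta=0` and `gamma=0` reduces the loss to ordinary cross-entropy, which is one way a single loss could be tuned across tasks with different degrees of imbalance.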