Localized Feature Aggregation Module for Semantic Segmentation
This work addresses the need for efficient and accurate semantic segmentation in medical imaging domains such as cell analysis and COVID-19 diagnosis, although the contribution appears incremental because it builds on existing U-net architectures.
The paper tackles the problem of recovering positional information in semantic segmentation. It proposes a Localized Feature Aggregation Module that emphasizes the similarity between encoder and decoder feature maps. According to the authors, this improves segmentation accuracy at lower computational cost than conventional methods such as U-net and attention U-net, as confirmed on Drosophila cell and COVID-19 image datasets.
We propose a new information aggregation method, called the Localized Feature Aggregation Module, based on the similarity between the feature maps of an encoder and a decoder. The proposed method recovers positional information by emphasizing the similarity between the decoder's feature maps, which carry superior semantic information, and the encoder's feature maps, which carry superior positional information. The proposed method can learn positional information more efficiently than the conventional concatenation used in the U-net and attention U-net. In addition, the proposed method restricts attention to a localized range to reduce the computational cost. Together, these two innovations improve segmentation accuracy at lower computational cost. In experiments on the Drosophila cell image dataset and the COVID-19 image dataset, we confirmed that our method outperformed conventional methods.
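The abstract does not give the module's exact formulation, so the following is only a minimal sketch of one way similarity-based aggregation over a localized range could work. It assumes dot-product similarity, a softmax over a square window, and encoder and decoder maps of equal size. The function name `localized_aggregation` and the `window` parameter are hypothetical and are not taken from the paper.

```python
import math


def localized_aggregation(enc, dec, window=3):
    """Aggregate encoder features using decoder features as queries.

    enc, dec: feature maps as nested lists of shape [H][W][C].
    Assumption (not from the paper): similarity is a dot product over
    channels, normalised by a softmax over a window x window neighbourhood.
    Neighbours that fall outside the map are skipped.
    """
    height, width, channels = len(enc), len(enc[0]), len(enc[0][0])
    radius = window // 2
    out = [[[0.0] * channels for _ in range(width)] for _ in range(height)]
    for i in range(height):
        for j in range(width):
            query = dec[i][j]
            # Encoder vectors inside the local window around (i, j).
            neighbours = [
                enc[y][x]
                for y in range(max(0, i - radius), min(height, i + radius + 1))
                for x in range(max(0, j - radius), min(width, j + radius + 1))
            ]
            # Dot-product similarity between the decoder query and each neighbour.
            scores = [sum(q * k for q, k in zip(query, key)) for key in neighbours]
            peak = max(scores)  # subtract the max for a numerically stable softmax
            weights = [math.exp(s - peak) for s in scores]
            total = sum(weights)
            # Similarity-weighted sum of the encoder vectors.
            for w, key in zip(weights, neighbours):
                for c in range(channels):
                    out[i][j][c] += (w / total) * key[c]
    return out


# Toy usage: a 2x2 map with one channel.
encoder_map = [[[1.0], [2.0]], [[3.0], [4.0]]]
decoder_map = [[[0.0], [0.0]], [[0.0], [0.0]]]
aggregated = localized_aggregation(encoder_map, decoder_map, window=3)
```

In this sketch each position compares against at most `window * window` encoder vectors rather than all H x W positions, which is where a localized range would save computation relative to global attention. How the aggregated features are then combined with the decoder features is not specified here.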