Shift-Equivariant Complex-Valued Convolutional Neural Networks
This work addresses shift-equivariant processing of complex-valued data in computer vision, particularly radar imaging, but the contribution is incremental because it builds on prior real-valued methods.
The paper tackles the lack of shift equivariance and invariance in convolutional neural networks by extending Learnable Polyphase Sampling (LPS) to complex-valued networks with a new projection layer, achieving improved performance on classification, reconstruction, and segmentation tasks using polarimetric SAR images.
Convolutional neural networks have shown remarkable performance in recent years on various computer vision problems. However, the traditional convolutional neural network architecture lacks two critical properties, shift equivariance and shift invariance, both of which are broken by downsampling and upsampling operations. Although data augmentation techniques can help the model learn invariance empirically, a consistent and systematic way to achieve this goal is to design downsampling and upsampling layers that theoretically guarantee these properties by construction. Adaptive Polyphase Sampling (APS) introduced the cornerstone for shift invariance, later extended to shift equivariance with Learnable Polyphase up/downsampling (LPS) applied to real-valued neural networks. In this paper, we extend the work on LPS to complex-valued neural networks, both from a theoretical perspective and with a novel building block: a projection layer from $\mathbb{C}$ to $\mathbb{R}$ placed before the Gumbel-Softmax. Finally, we evaluate this extension on several computer vision problems using polarimetric Synthetic Aperture Radar images, testing the invariance property on classification tasks and the equivariance property on reconstruction and semantic segmentation tasks.
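The idea of a downsampling layer that is shift-consistent by construction can be illustrated with a minimal sketch of APS-style polyphase selection on a complex-valued 2-D signal. This is not the paper's method: the helper names (`polyphase_components`, `energy`, `aps_downsample`, `circular_shift`) are hypothetical, the selection is a hard arg-max rather than the learned Gumbel-Softmax selection of LPS, shifts are assumed circular, and the squared modulus stands in for a complex-to-real projection purely as an assumption.

```python
import random


def polyphase_components(x, stride=2):
    """Split a 2-D grid (list of rows) into stride*stride polyphase components."""
    return [
        [row[j::stride] for row in x[i::stride]]
        for i in range(stride)
        for j in range(stride)
    ]


def energy(component):
    """Real-valued score of a complex component: sum of squared moduli.

    Assumption: this modulus-based score is only a stand-in for a
    complex-to-real projection; it is not the paper's learned layer.
    """
    return sum(abs(v) ** 2 for row in component for v in row)


def aps_downsample(x, stride=2):
    """Keep the polyphase component with the largest energy (hard selection)."""
    return max(polyphase_components(x, stride), key=energy)


def circular_shift(x, dy, dx):
    """Circularly shift a 2-D grid by dy rows and dx columns."""
    h, w = len(x), len(x[0])
    return [[x[(n - dy) % h][(m - dx) % w] for m in range(w)] for n in range(h)]


# Hypothetical demo input: an 8x8 complex-valued "image".
random.seed(0)
x = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(8)]
     for _ in range(8)]

y = aps_downsample(x)
y_shifted = aps_downsample(circular_shift(x, 1, 3))

# A circular shift of the input only permutes (and circularly shifts) the
# polyphase components, so the selected component holds the same values.
same_values = (
    sorted(abs(v) for row in y for v in row)
    == sorted(abs(v) for row in y_shifted for v in row)
)
```

By contrast, a fixed-grid downsampling such as keeping every second row and column from index 0 generally returns different samples after a one-pixel shift, which is the failure mode the abstract attributes to conventional downsampling layers.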