CV · Sep 19, 2022
On the Shift Invariance of Max Pooling Feature Maps in Convolutional Neural Networks
Hubert Leterme, Kévin Polisano, Valérie Perrier et al.
This paper focuses on improving the mathematical interpretability of convolutional neural networks (CNNs) in the context of image classification. Specifically, we tackle the instability issue arising in their first layer, which tends to learn parameters that closely resemble oriented band-pass filters when trained on datasets like ImageNet. Subsampled convolutions with such Gabor-like filters are prone to aliasing, causing sensitivity to small input shifts. In this context, we establish conditions under which the max pooling operator approximates a complex modulus, which is nearly shift invariant. We then derive a measure of shift invariance for subsampled convolutions followed by max pooling. In particular, we highlight the crucial role played by the filter's frequency and orientation in achieving stability. We experimentally validate our theory by considering a deterministic feature extractor based on the dual-tree complex wavelet packet transform, a particular case of discrete Gabor-like decomposition.
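As a rough illustration of the instability described above, the following 1-D NumPy sketch compares a real-valued subsampled convolution followed by max pooling with a complex-valued subsampled convolution followed by a modulus. It is not the authors' code: the function names, the Gabor parameters (`freq`, `sigma`), the strides, and the sensitivity measure are all assumptions chosen for the example, and the frequency is deliberately one for which max pooling is a poor stand-in for the modulus.

```python
import numpy as np

def gabor_kernel(size=31, freq=0.25, sigma=4.0):
    """Complex 1-D Gabor filter: Gaussian envelope times a complex exponential."""
    t = np.arange(size) - size // 2
    return np.exp(-t**2 / (2 * sigma**2)) * np.exp(2j * np.pi * freq * t)

def rmax(x, kernel, stride=2, pool=2):
    """Real-valued subsampled convolution followed by max pooling."""
    y = np.convolve(x, kernel.real, mode="same")[::stride]
    n = len(y) // pool * pool  # drop a trailing sample if needed
    return y[:n].reshape(-1, pool).max(axis=1)

def cmod(x, kernel, stride=4):
    """Complex-valued subsampled convolution followed by a modulus.

    The stride of 4 matches the total subsampling of rmax (2 x 2).
    """
    return np.abs(np.convolve(x, kernel, mode="same")[::stride])

def shift_sensitivity(op, x, kernel, shift=1):
    """Relative change of the output when the input is shifted (illustrative measure)."""
    a = op(x, kernel)
    b = op(np.roll(x, shift), kernel)
    return np.linalg.norm(a - b) / np.linalg.norm(a)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(512)
    k = gabor_kernel()
    print("RMax sensitivity:", shift_sensitivity(rmax, x, k))
    print("CMod sensitivity:", shift_sensitivity(cmod, x, k))
```

Changing `freq` in `gabor_kernel` is a simple way to explore how the filter's frequency affects the gap between the two operators.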
CO · Mar 23
A plug-and-play approach with fast uncertainty quantification for weak lensing mass mapping
Hubert Leterme, Andreas Tersenov, Jalal Fadili et al.
Upcoming stage-IV surveys such as Euclid and Rubin will deliver vast amounts of high-precision data, opening new opportunities to constrain cosmological models with unprecedented accuracy. A key step in this process is the reconstruction of the dark matter distribution from noisy weak lensing shear measurements. Current deep learning-based mass mapping methods achieve high reconstruction accuracy, but either require retraining a model for each new observed sky region (limiting practicality) or rely on slow MCMC sampling. Efficient exploitation of future survey data therefore calls for a new method that is accurate, flexible, and fast at inference. In addition, uncertainty quantification with coverage guarantees is essential for reliable cosmological parameter estimation. We introduce PnPMass, a plug-and-play approach for weak lensing mass mapping. The algorithm produces point estimates by alternating between a gradient descent step with a carefully chosen data fidelity term, and a denoising step implemented with a single deep learning model trained on simulated data corrupted by Gaussian white noise. We also propose a fast, sampling-free uncertainty quantification scheme based on moment networks, with calibrated error bars obtained through conformal prediction to ensure coverage guarantees. Finally, we benchmark PnPMass against both model-driven and data-driven mass mapping techniques. PnPMass achieves performance close to that of state-of-the-art deep-learning methods while offering fast inference (converging in just a few iterations) and requiring only a single training phase, independently of the noise covariance of the observations. It therefore combines flexibility, efficiency, and reconstruction accuracy, while delivering tighter error bars than existing approaches, making it well suited for upcoming weak lensing surveys.
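The alternation between a gradient step on the data fidelity term and a denoising step can be sketched on a toy 1-D problem. This is not PnPMass: the forward operator here is a simple observation mask rather than a lensing operator, the denoiser is a 3-tap moving average standing in for the trained deep learning model, and the names `smooth_denoiser` and `pnp_reconstruct` are hypothetical.

```python
import numpy as np

def smooth_denoiser(x):
    """Stand-in denoiser: 3-tap moving average (NOT a trained network)."""
    return np.convolve(x, [0.25, 0.5, 0.25], mode="same")

def pnp_reconstruct(y, mask, n_iter=200, step=1.0, denoiser=smooth_denoiser):
    """Plug-and-play iterations for the toy model y = mask * x + noise."""
    x = np.zeros_like(y)
    for _ in range(n_iter):
        # Gradient of the data fidelity term 0.5 * ||mask * x - y||^2.
        grad = mask * (mask * x - y)
        # Gradient descent step, then denoising step.
        x = denoiser(x - step * grad)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 128
    x_true = np.sin(2 * np.pi * np.arange(n) / 64)
    mask = (rng.random(n) < 0.5).astype(float)  # observe about half the samples
    y = mask * (x_true + 0.1 * rng.standard_normal(n))
    x_hat = pnp_reconstruct(y, mask)
    print("error of zero-filled data:", np.linalg.norm(y - x_true))
    print("error of PnP estimate:   ", np.linalg.norm(x_hat - x_true))
```

Because the denoiser is passed in as an argument, the same loop would accept any other denoising function, which is the property that makes the scheme "plug-and-play".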
CV · Dec 1, 2022
From CNNs to Shift-Invariant Twin Models Based on Complex Wavelets
Hubert Leterme, Kévin Polisano, Valérie Perrier et al.
We propose a novel method to increase shift invariance and prediction accuracy in convolutional neural networks. Specifically, we replace the first-layer combination "real-valued convolutions + max pooling" (RMax) by "complex-valued convolutions + modulus" (CMod), which is stable to translations, or shifts. To justify our approach, we claim that CMod and RMax produce comparable outputs when the convolution kernel is band-pass and oriented (Gabor-like filter). In this context, CMod can therefore be considered as a stable alternative to RMax. To enforce this property, we constrain the convolution kernels to adopt such a Gabor-like structure. The corresponding architecture is called a mathematical twin, because it employs a well-defined mathematical operator to mimic the behavior of the original, freely-trained model. Our approach achieves superior accuracy on ImageNet and CIFAR-10 classification tasks, compared to prior methods based on low-pass filtering. Arguably, our approach's emphasis on retaining high-frequency details contributes to a better balance between shift invariance and information preservation, resulting in improved performance. Furthermore, it has a lower computational cost and memory footprint than concurrent work, making it a promising solution for practical implementation.
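To make "band-pass and oriented (Gabor-like filter)" concrete, here is a minimal sketch of a 2-D complex Gabor kernel parametrized by a frequency and an orientation, with a check of where its spectrum peaks. The parametrization and the names `gabor_2d` and `peak_frequency` are assumptions for illustration; the paper's twin models rely on complex wavelets, not on this exact construction.

```python
import numpy as np

def gabor_2d(size=15, freq=0.2, theta=np.pi / 4, sigma=3.0):
    """Complex 2-D Gabor kernel: isotropic Gaussian envelope times an oriented plane wave."""
    r = np.arange(size) - size // 2
    yy, xx = np.meshgrid(r, r, indexing="ij")
    envelope = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    carrier = np.exp(2j * np.pi * freq * (xx * np.cos(theta) + yy * np.sin(theta)))
    return envelope * carrier

def peak_frequency(kernel, n_fft=64):
    """Return (fx, fy), in cycles per sample, where the kernel's spectrum is largest."""
    spectrum = np.abs(np.fft.fft2(kernel, s=(n_fft, n_fft)))
    iy, ix = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    freqs = np.fft.fftfreq(n_fft)
    return freqs[ix], freqs[iy]

if __name__ == "__main__":
    kernel = gabor_2d()
    # The real part would play the role of an RMax kernel, the complex kernel that of CMod.
    print("peak frequency (fx, fy):", peak_frequency(kernel))
```

In this sketch the spectrum of the complex kernel is concentrated around a single frequency vector of norm `freq` and angle `theta`, which is what "band-pass and oriented" refers to.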
CV · Mar 19, 2025
Disentangling Modes and Interference in the Spectrogram of Multicomponent Signals
Kévin Polisano, Sylvain Meignen, Nils Laurent et al.
In this paper, we investigate how the spectrogram of multicomponent signals can be decomposed into a mode part and an interference part. We explore two approaches: (i) a variational method inspired by texture-geometry decomposition in image processing, and (ii) a supervised learning approach using a U-Net architecture, trained on a dataset encompassing diverse interference patterns and noise conditions. Once the interference component is identified, we explain how it enables us to define a criterion to locally adapt the window length used in the definition of the spectrogram, for the sake of improving ridge detection in the presence of close modes. Numerical experiments illustrate the advantages and limitations of both approaches for spectrogram decomposition, highlighting their potential for enhancing time-frequency analysis in the presence of strong interference.
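The link between window length and interference can be illustrated with two close pure tones. The sketch below is only a toy experiment, not the variational or U-Net decomposition studied in the paper: it computes a Gaussian-window spectrogram with NumPy and measures how much the spectrogram fluctuates over time at the frequency of one mode. The complex-valued test signal, the window widths, and the `interference_level` measure are assumptions.

```python
import numpy as np

def spectrogram(x, sigma, hop=8, n_fft=500):
    """Squared modulus of a short-time Fourier transform with a Gaussian window.

    The window is truncated at +/- 3 * sigma samples and normalized to sum to 1.
    Rows index time frames, columns index frequency bins.
    """
    half = int(3 * sigma)
    t = np.arange(-half, half + 1)
    window = np.exp(-t**2 / (2 * sigma**2))
    window /= window.sum()
    starts = np.arange(0, len(x) - len(window) + 1, hop)
    frames = np.stack([x[s:s + len(window)] for s in starts])
    return np.abs(np.fft.fft(frames * window, n=n_fft, axis=1)) ** 2

def interference_level(spec, freq_bin):
    """Relative temporal fluctuation of the spectrogram at one frequency bin."""
    row = spec[:, freq_bin]
    return (row.max() - row.min()) / row.mean()

if __name__ == "__main__":
    n = np.arange(1024)
    f1, f2 = 0.20, 0.22  # two close modes, in cycles per sample
    x = np.exp(2j * np.pi * f1 * n) + np.exp(2j * np.pi * f2 * n)
    bin_f1 = 100  # f1 * n_fft with n_fft = 500
    for sigma in (8.0, 40.0):
        level = interference_level(spectrogram(x, sigma), bin_f1)
        print(f"sigma = {sigma}: interference level = {level}")
```

With the short window the two modes overlap in frequency and the spectrogram beats at the frequency difference; with the long window they separate and the fluctuation largely vanishes, at the cost of time resolution. This trade-off is what a local window-length criterion has to arbitrate.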