CV | Feb 26, 2024 | Code
Efficient 3D affinely equivariant CNNs with adaptive fusion of augmented spherical Fourier-Bessel bases
Wenzhao Zhao, Steffen Albert, Barbara D. Wichtmann et al.
Filter-decomposition-based group equivariant convolutional neural networks (CNNs) have shown promising stability and data efficiency for 3D image feature extraction. However, these networks, which rely on parameter sharing and discrete transformation groups, often underperform in modern deep neural network architectures for processing volumetric images, such as common 3D medical images. To address these limitations, this paper presents an efficient non-parameter-sharing continuous 3D affine group equivariant neural network for volumetric images. This network uses an adaptive aggregation of Monte Carlo augmented spherical Fourier-Bessel filter bases to improve the efficiency and flexibility of 3D group equivariant CNNs for volumetric data. Unlike existing methods that focus only on angular orthogonality in filter bases, the introduced spherical Fourier-Bessel filter basis incorporates both angular and radial orthogonality to improve feature extraction. Experiments on four medical image segmentation datasets show that the proposed methods achieve better affine group equivariance and superior segmentation accuracy than existing 3D group equivariant convolutional neural network layers, while significantly improving the training stability and data efficiency of conventional CNN layers (at the 0.05 significance level). The code is available at https://github.com/ZhaoWenzhao/WMCSFB.
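The radial orthogonality mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the simplest case of order-0 spherical Bessel functions j0(x) = sin(x)/x on the unit ball with a zero boundary condition, so the radial modes are j0(n*pi*r). The function names, the restriction to order 0, and the boundary condition are assumptions made for illustration; the paper's basis also involves angular components and may use a different normalization.

```python
import math

def j0(x):
    """Spherical Bessel function of order 0: sin(x)/x, with j0(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def radial_inner_product(n, m, steps=2000):
    """Midpoint-rule estimate of the integral over [0, 1] of
    j0(n*pi*r) * j0(m*pi*r) * r^2 dr (r^2 is the spherical volume weight)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += j0(n * math.pi * r) * j0(m * math.pi * r) * r * r
    return total * h

# Gram matrix of the first three radial modes.
gram = [[radial_inner_product(n, m) for m in (1, 2, 3)] for n in (1, 2, 3)]
for row in gram:
    print(["%.6f" % value for value in row])
```

Under these assumptions the off-diagonal entries vanish up to rounding, and the diagonal entries equal 1 / (2 n^2 pi^2), which is what lets the radial modes be combined with angular functions into a basis that is orthogonal in both directions.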
CV | May 17, 2023 | Code
Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network
Wenzhao Zhao, Barbara D. Wichtmann, Steffen Albert et al.
Group-equivariant convolutional neural networks (G-CNNs) rely heavily on parameter sharing to increase the data efficiency and performance of CNNs. However, the parameter-sharing strategy greatly increases the computational burden for each added parameter, which hampers its application to deep neural network models. In this paper, we address these problems by proposing a non-parameter-sharing approach for group equivariant neural networks. The proposed methods adaptively aggregate a diverse range of filters by a weighted sum of stochastically augmented decomposed filters. We give a theoretical proof of how group equivariance can be achieved by our methods. Our method applies to both continuous and discrete groups, where the augmentation is implemented using Monte Carlo sampling and bootstrap resampling, respectively. Our methods also serve as an efficient extension of standard CNNs. The experiments show that our method outperforms parameter-sharing group equivariant networks and enhances the performance of standard CNNs in image classification and denoising tasks by using suitable filter bases to build efficient lightweight networks. The code will be available at https://github.com/ZhaoWenzhao/MCG_CNN.
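The idea of a weighted sum of stochastically augmented filters can be sketched in a few lines. This is an illustrative 2D toy under stated assumptions, not the method from the paper: the basis function (a first-order Gaussian derivative), the sampling ranges for rotation and scale, and the names `gaussian_dx`, `sample_filter`, and `aggregate` are all hypothetical, and the random weights stand in for parameters that would be learned during training.

```python
import math
import random

def gaussian_dx(x, y, sigma=1.0):
    """First-order Gaussian derivative along x, a stand-in basis function."""
    return -x / sigma ** 2 * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))

def sample_filter(basis, theta, scale, size=5):
    """Sample `basis` on a size x size grid after rotating it by `theta`
    and scaling it by `scale` (the inverse transform is applied to the grid)."""
    half = size // 2
    c, s = math.cos(theta), math.sin(theta)
    grid = []
    for i in range(-half, half + 1):
        row = []
        for j in range(-half, half + 1):
            x = (c * j + s * i) / scale
            y = (-s * j + c * i) / scale
            row.append(basis(x, y))
        grid.append(row)
    return grid

def aggregate(filters, weights):
    """Weighted sum of equally sized filters."""
    size = len(filters[0])
    return [[sum(w * f[i][j] for w, f in zip(weights, filters))
             for j in range(size)] for i in range(size)]

rng = random.Random(0)
num_samples = 8
# Monte Carlo sampling of group elements (here: rotation angle and isotropic scale).
transforms = [(rng.uniform(0.0, 2.0 * math.pi), rng.uniform(0.8, 1.25))
              for _ in range(num_samples)]
bank = [sample_filter(gaussian_dx, theta, scale) for theta, scale in transforms]
weights = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]  # placeholder for learned weights
kernel = aggregate(bank, weights)
```

Each augmented filter carries its own weight, so no parameter is shared across the sampled transformations, and the resulting `kernel` has the same size as an ordinary convolution kernel.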
IV | Apr 12, 2025
seg2med: a bridge from artificial anatomy to multimodal medical images
Zeyu Yang, Zhilin Chen, Yipeng Sun et al.
We present seg2med, a modular framework for anatomy-driven multimodal medical image synthesis. The system integrates three components to enable high-fidelity, cross-modality generation of CT and MR images based on structured anatomical priors. First, anatomical maps are independently derived from three sources: real patient data, XCAT digital phantoms, and synthetic anatomies created by combining organs from multiple patients. Second, we introduce PhysioSynth, a modality-specific simulator that converts anatomical masks into prior volumes using tissue-dependent parameters (e.g., HU, T1, T2, proton density) and modality-specific signal models. It supports simulation of CT and multiple MR sequences including GRE, SPACE, and VIBE. Third, the synthesized anatomical priors are used to train 2-channel conditional denoising diffusion models, which take the anatomical prior as a structural condition alongside the noisy image, enabling generation of high-quality, structurally aligned images. The framework achieves an SSIM of 0.94 for CT and 0.89 for MR compared to real data, and an FSIM of 0.78 for simulated CT. The generative quality is further supported by a Fréchet Inception Distance (FID) of 3.62 for CT synthesis. In modality conversion, seg2med achieves an SSIM of 0.91 for MR to CT and 0.77 for CT to MR. Anatomical fidelity evaluation shows that synthetic CT achieves mean Dice scores above 0.90 for 11 key abdominal organs, and above 0.80 for 34 of 59 total organs. These results underscore seg2med's utility in cross-modality synthesis, data augmentation, and anatomy-aware medical AI.
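The step of converting an anatomical mask into a CT prior can be pictured as a lookup from tissue label to intensity. The sketch below is a simplified illustration only and does not reproduce PhysioSynth: the label ids, the table `ILLUSTRATIVE_HU`, the function `mask_to_ct_prior`, and the Hounsfield values for fat, soft tissue, and bone are rough assumptions chosen for the example (only air at -1000 HU follows from the definition of the scale).

```python
# Hypothetical label ids with rough, illustrative HU values (not taken from the paper).
ILLUSTRATIVE_HU = {
    0: -1000.0,  # air / background (air is -1000 HU by definition)
    1: -100.0,   # fat (approximate)
    2: 40.0,     # generic soft tissue (approximate)
    3: 700.0,    # bone (approximate, varies widely)
}

def mask_to_ct_prior(label_slice, hu_table, default=-1000.0):
    """Replace every label in a 2D label map with its tissue HU value.
    Labels missing from the table fall back to `default`."""
    return [[hu_table.get(label, default) for label in row] for row in label_slice]

# A tiny 3 x 4 label map standing in for one slice of a segmentation mask.
label_slice = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 2, 0],
]
prior = mask_to_ct_prior(label_slice, ILLUSTRATIVE_HU)
for row in prior:
    print(row)
```

A prior built this way is piecewise constant per tissue; in the framework described above, such a prior would serve as the conditioning channel next to the noisy image, while an MR prior would need a sequence-specific signal model instead of a plain lookup.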