72.8CVMar 29
Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model DevelopmentZhongying Deng, Cheng Tang, Ziyan Huang et al. · pku
Foundation models have demonstrated remarkable success across diverse domains and tasks, primarily due to the thrive of large-scale, diverse, and high-quality datasets. However, in the field of medical imaging, the curation and assembling of such medical datasets are highly challenging due to the reliance on clinical expertise and strict ethical and privacy constraints, resulting in a scarcity of large-scale unified medical datasets and hindering the development of powerful medical foundation models. In this work, we present the largest survey to date of medical image datasets, covering over 1,000 open-access datasets with a systematic catalog of their modalities, tasks, anatomies, annotations, limitations, and potential for integration. Our analysis exposes a landscape that is modest in scale, fragmented across narrowly scoped tasks, and unevenly distributed across organs and modalities, which in turn limits the utility of existing medical image datasets for developing versatile and robust medical foundation models. To turn fragmentation into scale, we propose a metadata-driven fusion paradigm (MDFP) that integrates public datasets with shared modalities or tasks, thereby transforming multiple small data silos into larger, more coherent resources. Building on MDFP, we release an interactive discovery portal that enables end-to-end, automated medical image dataset integration, and compile all surveyed datasets into a unified, structured table that clearly summarizes their key characteristics and provides reference links, offering the community an accessible and comprehensive repository. By charting the current terrain and offering a principled path to dataset consolidation, our survey provides a practical roadmap for scaling medical imaging corpora, supporting faster data discovery, more principled dataset creation, and more capable medical foundation models.
LGJul 14, 2023
Inverse Evolution Layers: Physics-informed Regularizers for Deep Neural NetworksChaoyu Liu, Zhonghua Qiao, Chao Li et al.
Traditional image processing methods employing partial differential equations (PDEs) offer a multitude of meaningful regularizers, along with valuable theoretical foundations for a wide range of image-related tasks. This makes their integration into neural networks a promising avenue. In this paper, we introduce a novel regularization approach inspired by the reverse process of PDE-based evolution models. Specifically, we propose inverse evolution layers (IELs), which serve as bad property amplifiers to penalize neural networks of which outputs have undesired characteristics. Using IELs, one can achieve specific regularization objectives and endow neural networks' outputs with corresponding properties of the PDE models. Our experiments, focusing on semantic segmentation tasks using heat-diffusion IELs, demonstrate their effectiveness in mitigating noisy label effects. Additionally, we develop curve-motion IELs to enforce convex shape regularization in neural network-based segmentation models for preventing the generation of concave outputs. Theoretical analysis confirms the efficacy of IELs as an effective regularization mechanism, particularly in handling training with label issues.
CVMar 17, 2022
An Active Contour Model with Local Variance Force Term and Its Efficient Minimization Solver for Multi-phase Image SegmentationChaoyu Liu, Zhonghua Qiao, Qian Zhang
In this paper, we propose an active contour model with a local variance force (LVF) term that can be applied to multi-phase image segmentation problems. With the LVF, the proposed model is very effective in the segmentation of images with noise. To solve this model efficiently, we represent the regularization term by characteristic functions and then design a minimization algorithm based on a modification of the iterative convolution-thresholding method (ICTM), namely ICTM-LVF. This minimization algorithm enjoys the energy-decaying property under some conditions and has highly efficient performance in the segmentation. To overcome the initialization issue of active contour models, we generalize the inhomogeneous graph Laplacian initialization method (IGLIM) to the multi-phase case and then apply it to give the initial contour of the ICTM-LVF solver. Numerical experiments are conducted on synthetic images and real images to demonstrate the capability of our initialization method, and the effectiveness of the local variance force for noise robustness in the multi-phase image segmentation.
77.6LGMay 13
Di-BiLPS: Denoising induced Bidirectional Latent-PDE-Solver under Sparse ObservationsZhonghao Li, Chaoyu Liu, Qian Zhang
Partial differential equations (PDEs) are fundamental for modeling complex natural and physical phenomena. In many real-world applications, however, observational data are extremely sparse, which severely limits the applicability of both classical numerical solvers and existing neural approaches. While neural methods have shown promising results under moderately sparse observations, their inference efficiency at high resolutions is limited, and their accuracy degrades substantially in the extremely sparse regime. In this work, we propose the Di-BiLPS, a unified neural framework that effectively handle both forward and inverse PDE problems under extremely sparse observations. Di-BiLPS combines a variational autoencoder to compress high-dimensional inputs into a compact latent space, a latent diffusion module to model uncertainty, and contrastive learning to align representations. Operating entirely in this latent space, the framework achieves efficient inference while retaining flexible input-output mapping. In addition, we introduce a PDE-informed denoising algorithm based on a variance-preserving diffusion process, which further improves inference efficiency. Extensive experiments on multiple PDE benchmarks demonstrate that Di-BiLPS consistently achieves SOTA performance under extremely sparse inputs (as low as 3%), while substantially reducing computational cost. Moreover, Di-BiLPS enables zero-shot super-resolution, as it allows predictions over continuous spatial-temporal domains.
87.1CEMay 7
Adaptive Coordinate Transforms for Neural OperatorsChaoyu Liu, Zhonghao Li, Gaohang Chen et al.
Neural operators have achieved promising performance on partial differential equations (PDEs), but most existing models are built on fixed Eulerian coordinates. This mismatch between evolving physical structures and static coordinates creates spatial misalignment, leading to unnecessarily non-local operator mappings and reinforcing a smoothness preference near sharp transitions. Inspired by adaptive coordinate transformations in classical PDE analysis, we propose the Adaptive Coordinate Transform (ACT) block, a plug-and-play module for data-driven geometric adaptation in neural operators. ACT blocks resolve this structural limitation by learning adaptive coordinate systems within the operator learning pipeline. Specifically, given an input feature, the ACT block learns a coordinate transformation and represents the same feature under the transformed coordinates via differentiable sampling. This operation preserves the underlying signal while changing its spatial representation, equivalent to expressing the same physical quantity in different coordinate systems. By adapting the coordinate system to the data, ACT allows the network to better track evolving structures, reduce operator complexity, and dynamically focus on critical features to improve learning. We evaluate the proposed approach across diverse PDE benchmarks and multiple neural operator architectures. Experimental results demonstrate consistent and significant improvements in predictive accuracy, indicating that learning coordinate systems provides a powerful mechanism for enhancing operator learning.
LGMar 22, 2025
Enhancing Fourier Neural Operators with Local Spatial FeaturesChaoyu Liu, Davide Murari, Lihao Liu et al.
Partial Differential Equation (PDE) problems often exhibit strong local spatial structures, and effectively capturing these structures is critical for approximating their solutions. Recently, the Fourier Neural Operator (FNO) has emerged as an efficient approach for solving these PDE problems. By using parametrization in the frequency domain, FNOs can efficiently capture global patterns. However, this approach inherently overlooks the critical role of local spatial features, as frequency-domain parameterized convolutions primarily emphasize global interactions without encoding comprehensive localized spatial dependencies. Although several studies have attempted to address this limitation, their extracted Local Spatial Features (LSFs) remain insufficient, and computational efficiency is often compromised. To address this limitation, we introduce a convolutional neural network (CNN)-based feature pre-extractor to capture LSFs directly from input data, resulting in a hybrid architecture termed \textit{Conv-FNO}. Furthermore, we introduce two novel resizing schemes to make our Conv-FNO resolution invariant. In this work, we focus on demonstrating the effectiveness of incorporating LSFs into FNOs by conducting both a theoretical analysis and extensive numerical experiments. Our findings show that this simple yet impactful modification enhances the representational capacity of FNOs and significantly improves performance on challenging PDE benchmarks.
LGMay 30, 2025
Conservation-preserved Fourier Neural Operator through Adaptive CorrectionChaoyu Liu, Yangming Li, Zhongying Deng et al.
Fourier Neural Operators (FNOs) have recently emerged as a promising and efficient approach for learning the numerical solutions to partial differential equations (PDEs) from data. However, standard FNO often fails to preserve key conservation laws, such as mass conservation, momentum conservation, norm conservation, etc., which are crucial for accurately modeling physical systems. Existing methods for incorporating these conservation laws into Fourier neural operators are achieved by designing related loss function or incorporating post-processing method at the training time. None of them can both exactly and adaptively correct the outputs to satisfy conservation laws, and our experiments show that these methods can lead to inferior performance while preserving conservation laws. In this work, we propose a novel adaptive correction approach to ensure the conservation of fundamental quantities. Our method introduces a learnable matrix to adaptively adjust the solution to satisfy the conservation law during training. It ensures that the outputs exactly satisfy the goal conservation law and allow for more flexibility and adaptivity for the model to correct the outputs. We theoretically show that applying our adaptive correction to an unconstrained FNO yields a solution with data loss no worse than that of the best conservation-satisfying FNO. We compare our approach with existing methods on a range of representative PDEs. Experiment results show that our method consistently outperform other methods.
LGJan 24, 2025
Inverse Evolution Data Augmentation for Neural PDE SolversChaoyu Liu, Chris Budd, Carola-Bibiane Schönlieb
Neural networks have emerged as promising tools for solving partial differential equations (PDEs), particularly through the application of neural operators. Training neural operators typically requires a large amount of training data to ensure accuracy and generalization. In this paper, we propose a novel data augmentation method specifically designed for training neural operators on evolution equations. Our approach utilizes insights from inverse processes of these equations to efficiently generate data from random initialization that are combined with original data. To further enhance the accuracy of the augmented data, we introduce high-order inverse evolution schemes. These schemes consist of only a few explicit computation steps, yet the resulting data pairs can be proven to satisfy the corresponding implicit numerical schemes. In contrast to traditional PDE solvers that require small time steps or implicit schemes to guarantee accuracy, our data augmentation method employs explicit schemes with relatively large time steps, thereby significantly reducing computational costs. Accuracy and efficacy experiments confirm the effectiveness of our approach. Additionally, we validate our approach through experiments with the Fourier Neural Operator and UNet on three common evolution equations that are Burgers' equation, the Allen-Cahn equation and the Navier-Stokes equation. The results demonstrate a significant improvement in the performance and robustness of the Fourier Neural Operator when coupled with our inverse evolution data augmentation method.
CVSep 5, 2025
GeoSplat: A Deep Dive into Geometry-Constrained Gaussian SplattingYangming Li, Chaoyu Liu, Lihao Liu et al.
A few recent works explored incorporating geometric priors to regularize the optimization of Gaussian splatting, further improving its performance. However, those early studies mainly focused on the use of low-order geometric priors (e.g., normal vector), and they might also be unreliably estimated by noise-sensitive methods, like local principal component analysis. To address their limitations, we first present GeoSplat, a general geometry-constrained optimization framework that exploits both first-order and second-order geometric quantities to improve the entire training pipeline of Gaussian splatting, including Gaussian initialization, gradient update, and densification. As an example, we initialize the scales of 3D Gaussian primitives in terms of principal curvatures, leading to a better coverage of the object surface than random initialization. Secondly, based on certain geometric structures (e.g., local manifold), we introduce efficient and noise-robust estimation methods that provide dynamic geometric priors for our framework. We conduct extensive experiments on multiple datasets for novel view synthesis, showing that our framework, GeoSplat, significantly improves the performance of Gaussian splatting and outperforms previous baselines.
CVAug 27, 2025
IELDG: Suppressing Domain-Specific Noise with Inverse Evolution Layers for Domain Generalized Semantic SegmentationQizhe Fan, Chaoyu Liu, Zhonghua Qiao et al.
Domain Generalized Semantic Segmentation (DGSS) focuses on training a model using labeled data from a source domain, with the goal of achieving robust generalization to unseen target domains during inference. A common approach to improve generalization is to augment the source domain with synthetic data generated by diffusion models (DMs). However, the generated images often contain structural or semantic defects due to training imperfections. Training segmentation models with such flawed data can lead to performance degradation and error accumulation. To address this issue, we propose to integrate inverse evolution layers (IELs) into the generative process. IELs are designed to highlight spatial discontinuities and semantic inconsistencies using Laplacian-based priors, enabling more effective filtering of undesirable generative patterns. Based on this mechanism, we introduce IELDM, an enhanced diffusion-based data augmentation framework that can produce higher-quality images. Furthermore, we observe that the defect-suppression capability of IELs can also benefit the segmentation network by suppressing artifact propagation. Based on this insight, we embed IELs into the decoder of the DGSS model and propose IELFormer to strengthen generalization capability in cross-domain scenarios. To further strengthen the model's semantic consistency across scales, IELFormer incorporates a multi-scale frequency fusion (MFF) module, which performs frequency-domain analysis to achieve structured integration of multi-resolution features, thereby improving cross-scale coherence. Extensive experiments on benchmark datasets demonstrate that our approach achieves superior generalization performance compared to existing methods.
CVMay 1, 2025
Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease AnalysisZhongying Deng, Haoyu Wang, Ziyan Huang et al.
Brain diseases, such as Alzheimer's disease and brain tumors, present profound challenges due to their complexity and societal impact. Recent advancements in brain foundation models have shown significant promise in addressing a range of brain-related tasks. However, current brain foundation models are limited by task and data homogeneity, restricted generalization beyond segmentation or classification, and inefficient adaptation to diverse clinical tasks. In this work, we propose SAM-Brain3D, a brain-specific foundation model trained on over 66,000 brain image-label pairs across 14 MRI sub-modalities, and Hypergraph Dynamic Adapter (HyDA), a lightweight adapter for efficient and effective downstream adaptation. SAM-Brain3D captures detailed brain-specific anatomical and modality priors for segmenting diverse brain targets and broader downstream tasks. HyDA leverages hypergraphs to fuse complementary multi-modal data and dynamically generate patient-specific convolutional kernels for multi-scale feature fusion and personalized patient-wise adaptation. Together, our framework excels across a broad spectrum of brain disease segmentation and classification tasks. Extensive experiments demonstrate that our method consistently outperforms existing state-of-the-art approaches, offering a new paradigm for brain disease analysis through multi-modal, multi-scale, and dynamic foundation modeling.
LGJan 29, 2025
Generative Unordered Flow for Set-Structured Data GenerationYangming Li, Chaoyu Liu, Carola-Bibiane Schönlieb
Flow-based generative models have demonstrated promising performance across a broad spectrum of data modalities (e.g., image and text). However, there are few works exploring their extension to unordered data (e.g., spatial point set), which is not trivial because previous models are mostly designed for vector data that are naturally ordered. In this paper, we present unordered flow, a type of flow-based generative model for set-structured data generation. Specifically, we convert unordered data into an appropriate function representation, and learn the probability measure of such representations through function-valued flow matching. For the inverse map from a function representation to unordered data, we propose a method similar to particle filtering, with Langevin dynamics to first warm-up the initial particles and gradient-based search to update them until convergence. We have conducted extensive experiments on multiple real-world datasets, showing that our unordered flow model is very effective in generating set-structured data and significantly outperforms previous baselines.