GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks
This work proposes a new generative paradigm for medical AI that addresses generalization across heterogeneous data and modalities, offering a reusable, task-agnostic framework.
GenMed reformulates medical diagnostic tasks as generative joint modeling of inputs and outputs using diffusion models, enabling flexible inference without retraining. It achieves strong performance across diverse tasks, including cross-modality segmentation, few-shot segmentation with only 2 or 4 training samples, and zero-shot application.
Data-driven medical AI is traditionally formulated as a discriminative mapping from input $X$ to output $Y$ via a learned function $f$, a formulation that does not generalize well across the heterogeneous data and modalities encountered in real-world clinical settings. In this work, we propose a fundamentally different, generative paradigm. We model the joint distribution $P(X,Y)$ using diffusion models and reframe inference as a test-time output optimization problem. By guiding the generative process to match observed inputs, our framework enables flexible, gradient-based conditioning at inference time without architectural changes or retraining, effectively supporting arbitrary and previously unseen combinations of observations. Extensive experiments demonstrate strong performance across standard and cross-modality medical image segmentation, few-shot segmentation with only 2 or 4 training samples, degraded-input segmentation, shape completion from sparse and partial observations, and zero-shot application, which together demonstrate the framework's generality. To support these evaluations, we curated and released a large-scale text-shape dataset derived from MedShapeNet. Our results highlight the versatility of generative joint modeling as a foundation for reusable, task-agnostic medical AI systems.
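As a rough illustration of gradient-based conditioning on a joint model, the toy sketch below replaces the learned diffusion model with a two-dimensional Gaussian over $(X, Y)$ whose score is known in closed form. This is an assumption made purely for illustration and is not the method used in this work. The names `joint_score`, `infer`, `RHO`, and `obs_var` are hypothetical. The same fixed "model" is queried with different observation masks, so either variable can be inferred from the other without retraining.

```python
import numpy as np

# Toy stand-in for a learned joint model P(X, Y): a zero-mean 2D Gaussian
# with unit variances and correlation RHO (an assumption for illustration).
RHO = 0.8
SIGMA = np.array([[1.0, RHO], [RHO, 1.0]])
SIGMA_INV = np.linalg.inv(SIGMA)


def joint_score(z):
    """Gradient of log p(x, y) w.r.t. z = (x, y) for the toy Gaussian.

    In a real system this would be supplied by a trained generative model.
    """
    return -SIGMA_INV @ z


def infer(obs, mask, obs_var=1e-2, step=1e-2, n_steps=3000):
    """Test-time optimization of z = (x, y) under a partial observation.

    obs  : observed values (entries where mask == 0 are ignored)
    mask : 1.0 for observed coordinates, 0.0 for unobserved ones
    Each step follows the joint score plus a guidance term that pulls the
    observed coordinates toward their measured values.
    """
    z = np.zeros(2)
    for _ in range(n_steps):
        guidance = mask * (obs - z) / obs_var
        z = z + step * (joint_score(z) + guidance)
    return z


# Observe X, infer Y.
x_obs = 1.5
z_from_x = infer(np.array([x_obs, 0.0]), np.array([1.0, 0.0]))

# Observe Y, infer X, using the same model and no retraining.
y_obs = -0.7
z_from_y = infer(np.array([0.0, y_obs]), np.array([0.0, 1.0]))

print("inferred y given x:", z_from_x[1])
print("inferred x given y:", z_from_y[0])
```

For this Gaussian toy the optimum is available analytically: the inferred value approaches `RHO * observed / (1 + obs_var)`, which gives a simple check on the update rule. A diffusion-based system would apply comparable guidance across noise levels during sampling, a detail this sketch omits.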