CV | LG | Jan 8, 2025

DeFusion: An Effective Decoupling Fusion Network for Multi-Modal Pregnancy Prediction

arXiv:2501.04353v2 | 3 citations | h-index: 6 | Has Code
AI Analysis

This work addresses pregnancy prediction for IVF-ET patients. It offers a novel multi-modal fusion approach that yields an incremental improvement in accuracy for this specific medical application.

The paper tackled the problem of improving pregnancy prediction in IVF-ET by integrating temporal embryo images and parental fertility table indicators, proposing DeFusion, a decoupling fusion network that outperformed state-of-the-art methods on a dataset of 4046 cases.

Temporal embryo images and parental fertility table indicators are both valuable for pregnancy prediction in in vitro fertilization embryo transfer (IVF-ET). However, current machine learning models cannot make full use of the complementary information between the two modalities to improve pregnancy prediction performance. In this paper, we propose a Decoupling Fusion Network called DeFusion to effectively integrate the multi-modal information for IVF-ET pregnancy prediction. Specifically, we propose a decoupling fusion module that decouples the information from the different modalities into related and unrelated information, thereby achieving a finer-grained fusion. We also fuse temporal embryo images with a spatial-temporal position encoding and extract fertility table indicator information with a table transformer. To evaluate the effectiveness of our model, we use a new dataset of 4046 cases collected from Southern Medical University. The experiments show that our model outperforms state-of-the-art methods. Meanwhile, its performance on an eye disease prediction dataset indicates good generalization. Our code is available at https://github.com/Ou-Young-1999/DFNet.
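
The idea of splitting each modality into "related" and "unrelated" information can be pictured with a toy geometric analogue. The sketch below is not the learned decoupling fusion module from the paper or its repository. It assumes both modalities are already embedded as vectors of equal length, treats the component of one vector that lies along the other as "related", and treats the orthogonal remainder as "unrelated". The names `decouple` and `fuse` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def decouple(feature: Vector, reference: Vector) -> Tuple[Vector, Vector]:
    """Split `feature` into a part parallel to `reference` ("related")
    and an orthogonal remainder ("unrelated").

    Assumption: a fixed orthogonal projection stands in for whatever
    learned mapping a real model would use.
    """
    if len(feature) != len(reference):
        raise ValueError("feature and reference must have the same length")
    denom = dot(reference, reference)
    if denom == 0.0:
        # A zero reference carries no direction, so nothing is "related".
        return [0.0] * len(feature), list(feature)
    scale = dot(feature, reference) / denom
    related = [scale * r for r in reference]
    unrelated = [f - p for f, p in zip(feature, related)]
    return related, unrelated


def fuse(image_feat: Vector, table_feat: Vector) -> Vector:
    """Toy fusion: average the two related parts, then concatenate the
    result with both unrelated parts (output length is 3 * d)."""
    img_rel, img_unrel = decouple(image_feat, table_feat)
    tab_rel, tab_unrel = decouple(table_feat, image_feat)
    shared = [(a + b) / 2 for a, b in zip(img_rel, tab_rel)]
    return shared + img_unrel + tab_unrel


if __name__ == "__main__":
    image_embedding = [3.0, 4.0]  # hypothetical embryo-image embedding
    table_embedding = [1.0, 0.0]  # hypothetical fertility-table embedding
    related, unrelated = decouple(image_embedding, table_embedding)
    print("related:", related)
    print("unrelated:", unrelated)
    print("fused:", fuse(image_embedding, table_embedding))
```

By construction the two parts sum back to the original vector and the "unrelated" part is orthogonal to the reference. A trained model would instead learn the split, typically with loss terms that encourage this kind of separation.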

Code Implementations: 1 repo
