h-index27
4papers
15citations
Novelty44%
AI Score43

4 Papers

CVFeb 10Code
MieDB-100k: A Comprehensive Dataset for Medical Image Editing

Yongfan Lai, Wen Qian, Bo Liu et al.

The scarcity of high-quality data remains a primary bottleneck in adapting multimodal generative models for medical image editing. Existing medical image editing datasets often suffer from limited diversity, neglect of medical image understanding and inability to balance quality with scalability. To address these gaps, we propose MieDB-100k, a large-scale, high-quality and diverse dataset for text-guided medical image editing. It categorizes editing tasks into perspectives of Perception, Modification and Transformation, considering both understanding and generation abilities. We construct MieDB-100k via a data curation pipeline leveraging both modality-specific expert models and rule-based data synthetic methods, followed by rigorous manual inspection to ensure clinical fidelity. Extensive experiments demonstrate that model trained with MieDB-100k consistently outperform both open-source and proprietary models while exhibiting strong generalization ability. We anticipate that this dataset will serve as a cornerstone for future advancements in specialized medical image editing.

LGJan 10, 2025
DiffuSETS: 12-lead ECG Generation Conditioned on Clinical Text Reports and Patient-Specific Information

Yongfan Lai, Jiabo Chen, Deyun Zhang et al.

Heart disease remains a significant threat to human health. As a non-invasive diagnostic tool, the electrocardiogram (ECG) is one of the most widely used methods for cardiac screening. However, the scarcity of high-quality ECG data, driven by privacy concerns and limited medical resources, creates a pressing need for effective ECG signal generation. Existing approaches for generating ECG signals typically rely on small training datasets, lack comprehensive evaluation frameworks, and overlook potential applications beyond data augmentation. To address these challenges, we propose DiffuSETS, a novel framework capable of generating ECG signals with high semantic alignment and fidelity. DiffuSETS accepts various modalities of clinical text reports and patient-specific information as inputs, enabling the creation of clinically meaningful ECG signals. Additionally, to address the lack of standardized evaluation in ECG generation, we introduce a comprehensive benchmarking methodology to assess the effectiveness of generative models in this domain. Our model achieve excellent results in tests, proving its superiority in the task of ECG generation. Furthermore, we showcase its potential to mitigate data scarcity while exploring novel applications in cardiology education and medical knowledge discovery, highlighting the broader impact of our work.

LGOct 13, 2025
Reconstructing 12-Lead ECG from 3-Lead ECG using Variational Autoencoder to Improve Cardiac Disease Detection of Wearable ECG Devices

Xinyan Guan, Yongfan Lai, Jiarui Jin et al.

Twelve-lead electrocardiograms (ECGs) are the clinical gold standard for cardiac diagnosis, providing comprehensive spatial coverage of the heart necessary to detect conditions such as myocardial infarction (MI). However, their lack of portability limits continuous and large-scale use. Three-lead ECG systems are widely used in wearable devices due to their simplicity and mobility, but they often fail to capture pathologies in unmeasured regions. To address this, we propose WearECG, a Variational Autoencoder (VAE) method that reconstructs twelve-lead ECGs from three leads: II, V1, and V5. Our model includes architectural improvements to better capture temporal and spatial dependencies in ECG signals. We evaluate generation quality using MSE, MAE, and Frechet Inception Distance (FID), and assess clinical validity via a Turing test with expert cardiologists. To further validate diagnostic utility, we fine-tune ECGFounder, a large-scale pretrained ECG model, on a multi-label classification task involving over 40 cardiac conditions, including six different myocardial infarction locations, using both real and generated signals. Experiments on the MIMIC dataset show that our method produces physiologically realistic and diagnostically informative signals, with robust performance in downstream tasks. This work demonstrates the potential of generative modeling for ECG reconstruction and its implications for scalable, low-cost cardiac screening.

LGAug 1, 2025
ECGTwin: Personalized ECG Generation Using Controllable Diffusion Model

Yongfan Lai, Bo Liu, Xinyan Guan et al.

Personalized electrocardiogram (ECG) generation is to simulate a patient's ECG digital twins tailored to specific conditions. It has the potential to transform traditional healthcare into a more accurate individualized paradigm, while preserving the key benefits of conventional population-level ECG synthesis. However, this promising task presents two fundamental challenges: extracting individual features without ground truth and injecting various types of conditions without confusing generative model. In this paper, we present ECGTwin, a two-stage framework designed to address these challenges. In the first stage, an Individual Base Extractor trained via contrastive learning robustly captures personal features from a reference ECG. In the second stage, the extracted individual features, along with a target cardiac condition, are integrated into the diffusion-based generation process through our novel AdaX Condition Injector, which injects these signals via two dedicated and specialized pathways. Both qualitative and quantitative experiments have demonstrated that our model can not only generate ECG signals of high fidelity and diversity by offering a fine-grained generation controllability, but also preserving individual-specific features. Furthermore, ECGTwin shows the potential to enhance ECG auto-diagnosis in downstream application, confirming the possibility of precise personalized healthcare solutions.