CR · CV · LG · IV · Sep 29, 2022

Dataset Distillation for Medical Dataset Sharing

arXiv:2209.14603v4 · 37 citations · h-index: 26
Originality: Synthesis-oriented
AI Analysis

This work addresses the privacy and cost problems hospitals face when sharing medical data, but it is incremental: it applies an existing dataset distillation technique to a new domain.

The paper addresses the difficulty of sharing medical datasets between hospitals, which stems from privacy and cost constraints. It proposes a dataset distillation method that synthesizes a small dataset, and it reports high detection performance on a COVID-19 chest X-ray image dataset using only scarce anonymized images.

Sharing medical datasets between hospitals is challenging because of privacy-protection concerns and the massive cost of transmitting and storing many high-resolution medical images. Dataset distillation, however, can synthesize a small dataset such that models trained on it achieve performance comparable to models trained on the original large dataset, which shows potential for solving these sharing problems. Hence, this paper proposes a novel dataset distillation-based method for medical dataset sharing. Experimental results on a COVID-19 chest X-ray image dataset show that our method can achieve high detection performance even when using scarce anonymized chest X-ray images.
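
To make the idea of "synthesizing a small dataset" concrete, here is a minimal sketch of one generic flavor of dataset distillation: matching class-wise mean features under random ReLU projections. It is an illustration only and is not the method proposed in the paper. The function name `distill`, its parameters, and the toy 2-D data are all assumptions made for this example.

```python
import numpy as np

def distill(real_x, real_y, n_per_class=1, steps=200, lr=0.1,
            n_features=64, seed=0):
    """Toy distillation (hypothetical helper, not the paper's algorithm).

    Learns a few synthetic points per class whose mean random-ReLU
    features match those of the real points of the same class.
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(real_y)
    dim = real_x.shape[1]

    # Initialize the synthetic points from randomly chosen real samples.
    syn_x, syn_y = [], []
    for c in classes:
        idx = rng.choice(np.flatnonzero(real_y == c),
                         size=n_per_class, replace=False)
        syn_x.append(real_x[idx])
        syn_y.append(np.full(n_per_class, c))
    syn_x = np.concatenate(syn_x).astype(float)
    syn_y = np.concatenate(syn_y)

    for _ in range(steps):
        # Fresh random embedding each step; scaled to keep updates stable.
        W = rng.normal(size=(dim, n_features)) / np.sqrt(n_features)
        for c in classes:
            mask = syn_y == c
            s = syn_x[mask]
            pre_s = s @ W
            # Difference between synthetic and real mean features.
            diff = (np.maximum(pre_s, 0).mean(axis=0)
                    - np.maximum(real_x[real_y == c] @ W, 0).mean(axis=0))
            # Gradient of ||diff||^2 with respect to each synthetic point.
            grad = (2.0 / len(s)) * ((pre_s > 0) * diff) @ W.T
            syn_x[mask] = s - lr * grad
    return syn_x, syn_y

# Toy usage: 200 two-dimensional "real" samples distilled to 2 per class.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
small_x, small_y = distill(x, y, n_per_class=2, steps=100)
print(small_x.shape, small_y)
```

In a sharing scenario, only the small synthetic set (`small_x`, `small_y` here) would be transmitted instead of the full image collection. Whether such a set is actually privacy-preserving depends on the method and the data, and is not shown by this sketch.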

Code Implementations: 1 repo
