CV AI LG
Dec 15, 2023

Latent Diffusion Models with Image-Derived Annotations for Enhanced AI-Assisted Cancer Diagnosis in Histopathology

Pedro Osorio, Guillermo Jimenez-Perez, Javier Montalt-Tordera, Jens Hooge, Guillem Duran-Ballester, Shivam Singh, Moritz Radbruch, Ute Bach, Sabrina Schroeder, Krystyna Siudak, Julia Vienenkoetter, Bettina Lawrenz

arXiv:2312.09792v15.017 citations
h-index: 6
Diagnostics

Originality: Incremental advance

AI Analysis

This work addresses the need for large annotated datasets in histopathology by improving synthetic data generation. The advance is incremental because it builds on existing latent diffusion models.

The paper addresses synthetic histopathology image generation for AI-assisted cancer diagnosis by building prompts for latent diffusion models from image-derived features. Compared with prompts that contain only the healthy or cancerous label, this improves FID from 178.8 to 90.2, and the authors report that the synthetic data can effectively train AI models.

Artificial Intelligence (AI) based image analysis has an immense potential to support diagnostic histopathology, including cancer diagnostics. However, developing supervised AI methods requires large-scale annotated datasets. A potentially powerful solution is to augment training data with synthetic data. Latent diffusion models, which can generate high-quality, diverse synthetic images, are promising. However, the most common implementations rely on detailed textual descriptions, which are not generally available in this domain. This work proposes a method that constructs structured textual prompts from automatically extracted image features. We experiment with the PCam dataset, composed of tissue patches only loosely annotated as healthy or cancerous. We show that including image-derived features in the prompt, as opposed to only healthy and cancerous labels, improves the Fréchet Inception Distance (FID) from 178.8 to 90.2. We also show that pathologists find it challenging to detect synthetic images, with a median sensitivity/specificity of 0.55/0.55. Finally, we show that synthetic data effectively trains AI models.
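The idea of building a structured textual prompt from automatically extracted image features can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the features used, so the two features here (mean intensity and the fraction of dark pixels as a rough stand-in for nuclei density), the thresholds, the bin names, and the prompt template are all assumptions chosen for illustration.

```python
from statistics import mean


def extract_features(patch, dark_threshold=100):
    """Compute two hypothetical features from a grayscale patch.

    patch: 2D list of pixel intensities in the range 0-255.
    The features and the threshold are illustrative assumptions,
    not the ones used in the paper.
    """
    pixels = [p for row in patch for p in row]
    dark_fraction = sum(1 for p in pixels if p < dark_threshold) / len(pixels)
    return {"mean_intensity": mean(pixels), "dark_fraction": dark_fraction}


def bin_value(value, edges, names):
    """Map a number to a category name; names has one more entry than edges."""
    for edge, name in zip(edges, names):
        if value < edge:
            return name
    return names[-1]


def build_prompt(label, features):
    """Assemble a structured prompt from the class label and binned features."""
    density = bin_value(features["dark_fraction"], [0.2, 0.5],
                        ["low", "medium", "high"])
    brightness = bin_value(features["mean_intensity"], [85, 170],
                           ["dark", "mid", "bright"])
    return (f"histopathology patch; label: {label}; "
            f"nuclei density: {density}; brightness: {brightness}")


# Toy 2x2 patch: half the pixels are dark, half are bright.
toy_patch = [[50, 50], [200, 200]]
prompt = build_prompt("cancerous", extract_features(toy_patch))
print(prompt)
```

Binning continuous features into a few named categories keeps the prompt vocabulary small and consistent across images, which is one plausible way to give a text-conditioned model more to condition on than the healthy or cancerous label alone.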
