CV, AI · May 2, 2024

Long Tail Image Generation Through Feature Space Augmentation and Iterated Learning

Rafael Elberg, Denis Parra, Mircea Petrache

arXiv:2405.01705v1 · 7.65 citations · h-index: 2 · Has Code · LatinX in AI at Computer Vision and Pattern Recognition Conference 2024

Originality: Incremental advance

AI Analysis

This work addresses data scarcity and privacy issues in medical imaging by enhancing image generation for long-tailed distributions, though it is incremental as it builds on existing diffusion models.

The paper tackles the challenge of generating images for under-represented classes in long-tailed datasets, particularly in the medical domain. It proposes a method that uses the latent space of pre-trained Stable Diffusion models to mix head and tail class examples, resulting in improved image generation for tail classes.

Image and multimodal machine learning tasks are very challenging to solve in the case of poorly distributed data. In particular, data availability and privacy restrictions exacerbate these hurdles in the medical domain. The state of the art in image generation quality is held by Latent Diffusion models, making them prime candidates for tackling this problem. However, a few key issues still need to be solved, such as the difficulty in generating data from under-represented classes and a slow inference process. To mitigate these issues, we propose a new method for image augmentation in long-tailed data based on leveraging the rich latent space of pre-trained Stable Diffusion Models. We create a modified separable latent space to mix head and tail class examples. We build this space via Iterated Learning of underlying sparsified embeddings, which we apply to task-specific saliency maps via a K-NN approach. Code is available at https://github.com/SugarFreeManatee/Feature-Space-Augmentation-and-Iterated-Learning
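The mixing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the linked repository is the reference); it only shows one plausible reading of "mix head and tail class examples" with a K-NN selection. The function names, the binary mask standing in for a saliency map, the Euclidean distance, and the flat 16-dimensional latents are all assumptions made for illustration.

```python
import numpy as np


def nearest_head_neighbors(tail_latent, head_latents, k=3):
    """Return indices of the k head-class latents closest to tail_latent.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    dists = np.linalg.norm(head_latents - tail_latent, axis=1)
    return np.argsort(dists)[:k]


def mix_with_mask(tail_latent, head_latent, mask):
    """Keep tail features where mask == 1, borrow head features where mask == 0."""
    return mask * tail_latent + (1.0 - mask) * head_latent


def augment_tail(tail_latent, head_latents, mask, k=3):
    """Produce k augmented latents by mixing the tail latent with its k nearest head latents."""
    idx = nearest_head_neighbors(tail_latent, head_latents, k)
    return np.stack([mix_with_mask(tail_latent, head_latents[i], mask) for i in idx])


# Toy data: random vectors stand in for latents of a pre-trained model.
rng = np.random.default_rng(0)
head_latents = rng.normal(size=(50, 16))   # hypothetical head-class latents
tail_latent = rng.normal(size=16)          # hypothetical tail-class latent
# Hypothetical binary saliency mask: 1 marks features treated as class-specific.
mask = (rng.random(16) > 0.5).astype(float)

augmented = augment_tail(tail_latent, head_latents, mask, k=3)
print(augmented.shape)  # (3, 16)
```

In this sketch each augmented sample keeps the tail example's masked features and takes the remaining features from a nearby head example, so a single tail example yields k new latents. How the paper builds its separable latent space through Iterated Learning is not covered here.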
