CV · Nov 27, 2024

HypDAE: Hyperbolic Diffusion Autoencoders for Hierarchical Few-shot Image Generation

arXiv:2411.17784v2 · 2 citations · h-index: 3
Originality: Highly original
AI Analysis

This work addresses the challenge of generating diverse, high-quality images for unseen categories with limited examples, offering improved control and interpretability in image generation.

The paper tackles few-shot image generation, where category consistency and image diversity must be balanced against each other. It proposes HypDAE, which uses hyperbolic space to capture hierarchical relationships among images; the authors report that it achieves a better balance between the two goals than prior methods and offers finer control over generation.

Few-shot image generation aims to generate diverse and high-quality images for an unseen class given only a few examples in that class. A key challenge in this task is balancing category consistency and image diversity, which often compete with each other. Moreover, existing methods offer limited control over the attributes of newly generated images. In this work, we propose Hyperbolic Diffusion Autoencoders (HypDAE), a novel approach that operates in hyperbolic space to capture hierarchical relationships among images from seen categories. By leveraging pre-trained foundation models, HypDAE generates diverse new images for unseen categories with exceptional quality by varying stochastic subcodes or semantic codes. Most importantly, the hyperbolic representation introduces an additional degree of control over semantic diversity through the adjustment of radii within the hyperbolic disk. Extensive experiments and visualizations demonstrate that HypDAE significantly outperforms prior methods by achieving a better balance between preserving category-relevant features and promoting image diversity with limited data. Furthermore, HypDAE offers a highly controllable and interpretable generation process.
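The radius-based control mentioned in the abstract can be illustrated with standard Poincaré disk geometry. The sketch below is not the authors' implementation: it assumes a unit disk with curvature -1, and the helper names `hyperbolic_radius` and `rescale_to_radius` are hypothetical. It shows only how a latent point can be moved to a chosen hyperbolic distance from the origin while keeping its direction; how a given radius maps to semantic diversity in HypDAE is not specified here.

```python
import numpy as np

def hyperbolic_radius(x):
    """Hyperbolic distance from the origin to x in the Poincare disk (curvature -1)."""
    norm = np.linalg.norm(x)
    return 2.0 * np.arctanh(norm)

def rescale_to_radius(x, target_radius):
    """Move x along its own direction so its hyperbolic radius equals target_radius."""
    norm = np.linalg.norm(x)
    if norm == 0.0:
        raise ValueError("direction is undefined at the origin")
    # Invert d(0, x) = 2 * artanh(||x||) to get the required Euclidean norm.
    new_norm = np.tanh(target_radius / 2.0)
    return x * (new_norm / norm)

# Hypothetical 2-D latent code inside the unit disk (Euclidean norm 0.5).
z = np.array([0.3, 0.4])
z_outer = rescale_to_radius(z, 2.0)  # pushed toward the boundary
z_inner = rescale_to_radius(z, 0.5)  # pulled toward the origin
```

Both rescaled points stay strictly inside the unit disk because tanh is bounded by 1, so any finite target radius yields a valid point.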

