DemoFusion: Democratising High-Resolution Image Generation With No $$$
DemoFusion aims to democratise high-resolution image generation by making it accessible to a broad audience without large capital investment. The contribution is incremental in that it extends existing models rather than introducing a new one.
The paper tackles the centralisation of high-resolution image generation by demonstrating that existing Latent Diffusion Models (LDMs) have untapped potential for higher-resolution output. Its DemoFusion framework realises that potential through three mechanisms: Progressive Upscaling, Skip Residual, and Dilated Sampling.
High-resolution image generation with Generative Artificial Intelligence (GenAI) has immense potential but, due to the enormous capital investment required for training, it is increasingly centralised to a few large corporations, and hidden behind paywalls. This paper aims to democratise high-resolution GenAI by advancing the frontier of high-resolution generation while remaining accessible to a broad audience. We demonstrate that existing Latent Diffusion Models (LDMs) possess untapped potential for higher-resolution image generation. Our novel DemoFusion framework seamlessly extends open-source GenAI models, employing Progressive Upscaling, Skip Residual, and Dilated Sampling mechanisms to achieve higher-resolution image generation. The progressive nature of DemoFusion requires more passes, but the intermediate results can serve as "previews", facilitating rapid prompt iteration.
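The abstract names Dilated Sampling without describing it. As a rough illustration only, the sketch below assumes it amounts to strided subsampling of the latent grid, so that each sub-grid covers the whole image at reduced resolution. The function names `dilated_subgrids` and `merge_subgrids` and the toy 4x4 "latent" are hypothetical and are not taken from the paper or its code.

```python
def dilated_subgrids(grid, stride):
    """Split a 2D grid into stride*stride strided sub-grids.

    Sub-grid (i, j) takes every `stride`-th row starting at row i and
    every `stride`-th column starting at column j, so each sub-grid
    spans the full extent of the grid at lower resolution.
    """
    return {
        (i, j): [row[j::stride] for row in grid[i::stride]]
        for i in range(stride)
        for j in range(stride)
    }


def merge_subgrids(subgrids, stride, height, width):
    """Inverse of dilated_subgrids: scatter each sub-grid back into place."""
    out = [[None] * width for _ in range(height)]
    for (i, j), sub in subgrids.items():
        for r, row in enumerate(sub):
            for c, value in enumerate(row):
                out[i + r * stride][j + c * stride] = value
    return out


# Toy 4x4 "latent" whose entries record their own (row, col) position.
latent = [[(r, c) for c in range(4)] for r in range(4)]

# Assumption: in a real pipeline each sub-grid would be processed by the
# base model before merging; here they are merged back unchanged.
parts = dilated_subgrids(latent, stride=2)
restored = merge_subgrids(parts, stride=2, height=4, width=4)
```

With `stride=2`, the sub-grid at offset `(0, 0)` holds positions (0,0), (0,2), (2,0), and (2,2), which is why each sub-grid can carry global rather than purely local structure. Merging the unmodified sub-grids reproduces the original grid exactly.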