DiffNAS: Bootstrapping Diffusion Models by Prompting for Better Architectures
This work targets better-performing diffusion models for synthetic data generation. Its contribution is incremental, since it builds on existing architecture-search and diffusion techniques.
The paper proposes DiffNAS, a method for designing efficient base models for diffusion models that uses GPT-4 as a supernet to search for architectures. It reports a 2x improvement in search efficiency and a 0.37 improvement in FID on CIFAR10 over IDDPM.
Diffusion models have recently shown remarkable performance in synthetic data generation. Once a diffusion path is selected, a base model such as UNet operates as a denoising autoencoder, predicting the noise to be removed at each step. It is therefore crucial to choose a base model that fits the expected budget while delivering strong synthesis quality. In this paper, we analyze the diffusion model and design a base model search approach, denoted "DiffNAS". Specifically, we use GPT-4 as a supernet to speed up the search, supplemented with a search memory to improve the results. We also employ RFID as a proxy to quickly rank the candidates produced by GPT-4, and we adopt a rapid-convergence training strategy to further boost search efficiency. Experiments show that our algorithm improves search efficiency by 2x under GPT-based scenarios, while attaining an FID of 2.82 on CIFAR10, a 0.37 improvement over the IDDPM baseline.
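The search procedure described above (a proposer, a search memory, and a cheap proxy used for ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the search space, the `propose` function standing in for the GPT-4 call, and `toy_proxy_score` standing in for the RFID proxy are all hypothetical placeholders chosen so the loop runs without any external service or training.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical search space: (base_channels, num_res_blocks) of a UNet-like model.
Config = Tuple[int, int]
CHANNELS = [32, 64, 96, 128, 160, 192]
BLOCKS = [1, 2, 3, 4]


@dataclass
class SearchMemory:
    """Keeps the best-scoring candidates seen so far (lower score is better)."""
    capacity: int = 4
    entries: List[Tuple[float, Config]] = field(default_factory=list)

    def add(self, score: float, config: Config) -> None:
        self.entries.append((score, config))
        self.entries.sort(key=lambda e: e[0])
        del self.entries[self.capacity:]  # evict everything beyond capacity

    def best(self) -> Tuple[float, Config]:
        return self.entries[0]


def mutate(config: Config, rng: random.Random) -> Config:
    """Move each dimension by at most one step inside the search space."""
    c, b = config
    ci = CHANNELS.index(c) + rng.choice([-1, 0, 1])
    bi = BLOCKS.index(b) + rng.choice([-1, 0, 1])
    ci = min(max(ci, 0), len(CHANNELS) - 1)
    bi = min(max(bi, 0), len(BLOCKS) - 1)
    return CHANNELS[ci], BLOCKS[bi]


def propose(memory: SearchMemory, rng: random.Random, n: int) -> List[Config]:
    """Placeholder for the LLM call: condition new proposals on the memory."""
    if not memory.entries:
        return [(rng.choice(CHANNELS), rng.choice(BLOCKS)) for _ in range(n)]
    return [mutate(rng.choice(memory.entries)[1], rng) for _ in range(n)]


def toy_proxy_score(config: Config) -> float:
    """Placeholder for a cheap FID-like proxy; a real one would train briefly."""
    c, b = config
    return abs(c - 128) / 32 + abs(b - 3)


def search(
    rounds: int = 5,
    per_round: int = 4,
    seed: int = 0,
    score_fn: Callable[[Config], float] = toy_proxy_score,
) -> Tuple[SearchMemory, List[float]]:
    """Propose, score with the proxy, and fold results back into the memory."""
    rng = random.Random(seed)
    memory = SearchMemory(capacity=4)
    best_per_round: List[float] = []
    for _ in range(rounds):
        for config in propose(memory, rng, per_round):
            memory.add(score_fn(config), config)
        best_per_round.append(memory.best()[0])
    return memory, best_per_round


if __name__ == "__main__":
    memory, history = search()
    print("best score per round:", history)
    print("best config:", memory.best()[1])
```

Because the memory never evicts its best entry, the best proxy score cannot get worse from one round to the next. In a real setting the proxy is only a ranking signal, so the top candidates would still need full training and evaluation.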