LG CV Feb 20, 2024

Neural Network Diffusion

Kai Wang, Dongwen Tang, Boya Zeng, Yida Yin, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, Yang You

arXiv:2402.13144v326.045 citations h-index: 4 Has Code

Originality: Highly original

AI Analysis

This work applies diffusion models to a new target, neural network parameters, which could let AI researchers and practitioners synthesize new high-performing models at low additional cost.

The paper tackles the problem of generating high-performing neural network parameters with diffusion models. The generated models match or improve on the performance of conventionally trained networks across various architectures and datasets, at minimal additional cost.

Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a diffusion model. The autoencoder extracts latent representations of a subset of the trained neural network parameters. Next, a diffusion model is trained to synthesize these latent representations from random noise. This model then generates new representations, which are passed through the autoencoder's decoder to produce new subsets of high-performing network parameters. Across various architectures and datasets, our approach consistently generates models with comparable or improved performance over trained networks, with minimal additional cost. Notably, we empirically find that the generated models are not memorizing the trained ones. Our results encourage more exploration into the versatile use of diffusion models. Our code is available at https://github.com/NUS-HPC-AI-Lab/Neural-Network-Diffusion.
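
The pipeline described in the abstract (encode parameter subsets, train a diffusion model on the latents, sample, decode) can be illustrated with a toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: the checkpoints are synthetic random vectors, the "autoencoder" is a linear PCA projection, and the denoiser is a per-timestep linear noise predictor fit by least squares rather than a trained neural network. All names (`params`, `encode`, `decode`, `fit_denoiser`, `sample`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumption): K flattened parameter subsets that would
# normally come from saved checkpoints of a trained network.
K, D, latent_dim = 200, 64, 4
basis = rng.normal(size=(latent_dim, D))
params = (0.5 + 0.05 * rng.normal(size=(K, latent_dim)) @ basis
          + 0.001 * rng.normal(size=(K, D)))

# Linear stand-in for the autoencoder (assumption): PCA via SVD.
mean = params.mean(axis=0)
_, _, vt = np.linalg.svd(params - mean, full_matrices=False)
components = vt[:latent_dim]                     # shape (latent_dim, D)

def encode(p):
    return (p - mean) @ components.T

def decode(z):
    return z @ components + mean

latents = encode(params)
scale = latents.std(axis=0)                      # normalise each latent dim
z0 = latents / scale

# DDPM-style linear noise schedule.
T = 50
betas = np.linspace(1e-4, 0.2, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def fit_denoiser(z0, n_repeats=20):
    """Fit one linear noise predictor per timestep by least squares."""
    weights = []
    for t in range(T):
        x0 = np.tile(z0, (n_repeats, 1))
        eps = rng.normal(size=x0.shape)
        # Forward (noising) process: sample x_t given x_0.
        xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
        design = np.hstack([xt, np.ones((len(xt), 1))])
        w, *_ = np.linalg.lstsq(design, eps, rcond=None)
        weights.append(w)
    return weights

def predict_noise(weights, xt, t):
    design = np.hstack([xt, np.ones((len(xt), 1))])
    return design @ weights[t]

def sample(weights, n):
    """Ancestral sampling: start from pure noise and denoise step by step."""
    x = rng.normal(size=(n, latent_dim))
    for t in reversed(range(T)):
        eps_hat = predict_noise(weights, x, t)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            x = x + np.sqrt(betas[t]) * rng.normal(size=x.shape)
    return x

weights = fit_denoiser(z0)
new_latents = sample(weights, n=10)
new_params = decode(new_latents * scale)         # generated parameter subsets
print(new_params.shape)                          # (10, 64)
```

In a realistic setting, the decoded vectors would be reshaped and loaded back into the network in place of the chosen parameter subset, and each generated model would then be evaluated on held-out data.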
