LG · CV · Feb 20, 2024

Neural Network Diffusion

arXiv:2402.13144v3 · 45 citations · h-index: 4 · Has Code
Originality: Highly original
AI Analysis

This work introduces a novel application of diffusion models for neural network generation, which could benefit AI researchers and practitioners by enabling efficient model synthesis.

The paper tackles the problem of generating high-performing neural network parameters using diffusion models, achieving comparable or improved performance across various architectures and datasets with minimal additional cost.

Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a diffusion model. The autoencoder extracts latent representations of a subset of the trained neural network parameters. Next, a diffusion model is trained to synthesize these latent representations from random noise. This model then generates new representations, which are passed through the autoencoder's decoder to produce new subsets of high-performing network parameters. Across various architectures and datasets, our approach consistently generates models with comparable or improved performance over trained networks, with minimal additional cost. Notably, we empirically find that the generated models are not memorizing the trained ones. Our results encourage more exploration into the versatile use of diffusion models. Our code is available at https://github.com/NUS-HPC-AI-Lab/Neural-Network-Diffusion.
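The data flow described in the abstract (flatten a parameter subset, work in a latent space, sample latents from noise, decode back to parameters) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the function names, the parameter names `layer.weight` and `layer.bias`, the linear noise schedule, and the standard DDPM ancestral sampler are all assumptions made here, and the autoencoder is replaced by an identity placeholder. In place of a trained denoiser, the sketch uses an analytic "oracle" noise predictor for a dataset containing a single latent, so that the sampler's behavior is easy to check.

```python
import math
import random


def flatten(params):
    """Flatten {name: [floats]} into one vector and record each slice length."""
    layout, vec = [], []
    for name, values in params.items():
        layout.append((name, len(values)))
        vec.extend(values)
    return vec, layout


def unflatten(vec, layout):
    """Inverse of flatten: rebuild the {name: [floats]} mapping."""
    out, i = {}, 0
    for name, n in layout:
        out[name] = vec[i:i + n]
        i += n
    return out


def make_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption) and cumulative products of 1 - beta."""
    betas = [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bars.append(prod)
    return betas, alpha_bars


def sample_latent(eps_model, dim, betas, alpha_bars, rng):
    """Standard DDPM ancestral sampling, starting from Gaussian noise."""
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        eps = eps_model(z, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(zi - coef * ei) / math.sqrt(1.0 - betas[t])
                for zi, ei in zip(z, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # fresh noise on all but the last step
            z = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            z = mean
    return z


def oracle_eps(target, alpha_bars):
    """Exact noise predictor when the 'dataset' is the single latent `target`.
    A stand-in for a trained denoising network (assumption for this toy)."""
    def eps_model(z, t):
        a = alpha_bars[t]
        return [(zi - math.sqrt(a) * mi) / math.sqrt(1.0 - a)
                for zi, mi in zip(z, target)]
    return eps_model


# Identity placeholders marking where a trained encoder/decoder would go.
encode = decode = lambda v: list(v)

if __name__ == "__main__":
    trained = {"layer.weight": [1.0, 0.9], "layer.bias": [0.1, -0.2]}  # hypothetical
    vec, layout = flatten(trained)
    betas, alpha_bars = make_schedule(50)
    latent = sample_latent(oracle_eps(encode(vec), alpha_bars),
                           len(vec), betas, alpha_bars, random.Random(0))
    print(unflatten(decode(latent), layout))
```

With a single training latent the oracle sampler simply recovers that latent, which is exactly the memorization the abstract reports the real method avoids; a learned denoiser trained on many checkpoints would be needed to produce genuinely new parameters.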

Code Implementations (2 repos)
