Generative Modeling of Weights: Generalization or Memorization?
This work shows that generative models of neural network weights largely memorize their training checkpoints, contradicting claims in prior work and exposing gaps in how such models are evaluated.
The paper examines four generative methods for synthesizing neural network weights and finds that they largely produce replicas or simple interpolations of the training checkpoints rather than novel weights. They also fail to outperform simple baselines, such as adding noise to the weights or taking a weight ensemble.
Generative models have recently been explored for synthesizing neural network weights. These approaches take neural network checkpoints as training data and aim to generate high-performing weights during inference. In this work, we examine four representative, well-known methods on their ability to generate novel model weights, i.e., weights that are different from the checkpoints seen during training. Contrary to claims in prior work, we find that these methods synthesize weights largely by memorization: they produce either replicas, or, at best, simple interpolations of the training checkpoints. Moreover, they fail to outperform simple baselines, such as adding noise to the weights or taking a simple weight ensemble, in obtaining different and simultaneously high-performing models. Our further analysis suggests that this memorization might result from limited data, overparameterized models, and the underuse of structural priors specific to weight data. These findings highlight the need for more careful design and rigorous evaluation of generative models when applied to new domains. Our code is available at https://github.com/boyazeng/weight_memorization.
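To make the notion of novelty and the two simple baselines concrete, the following is a minimal sketch, not the evaluation used in the paper. It assumes each checkpoint is flattened into a plain list of floats and uses the L2 distance to the nearest training checkpoint as a stand-in novelty measure. The function names, the toy data, and the choice of distance are all hypothetical and are not taken from the abstract or the linked repository.

```python
import math
import random


def l2_distance(a, b):
    """Euclidean distance between two flattened weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_checkpoint_distance(candidate, checkpoints):
    """Distance from a candidate to its closest training checkpoint.

    A value near zero suggests the candidate is a replica (assumed proxy).
    """
    return min(l2_distance(candidate, c) for c in checkpoints)


def noise_baseline(checkpoint, sigma, rng):
    """Baseline 1: perturb a training checkpoint with Gaussian noise."""
    return [w + rng.gauss(0.0, sigma) for w in checkpoint]


def ensemble_baseline(checkpoints):
    """Baseline 2 (one possible reading): average checkpoints element-wise."""
    n = len(checkpoints)
    return [sum(ws) / n for ws in zip(*checkpoints)]


# Toy stand-ins for real checkpoints: 8 vectors of 16 weights each.
rng = random.Random(0)
checkpoints = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]

replica = list(checkpoints[0])                    # exact copy of a checkpoint
noisy = noise_baseline(checkpoints[0], 0.1, rng)  # noise-perturbed checkpoint
averaged = ensemble_baseline(checkpoints[:2])     # midpoint of two checkpoints

for name, weights in [("replica", replica), ("noisy", noisy), ("averaged", averaged)]:
    print(name, round(nearest_checkpoint_distance(weights, checkpoints), 4))
```

Distance alone says nothing about quality. The abstract asks for weights that are different and simultaneously high-performing, so a real comparison would also need to evaluate each candidate on held-out data.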