The Impact of Model Zoo Size and Composition on Weight Space Learning
Aimed at AI researchers, this work addresses limited generalization in weight space learning. It is an incremental improvement that extends an existing method to handle diverse model zoos.
The paper tackles a limitation of weight space learning methods, namely that they require homogeneous model zoos, by modifying a common method to accommodate heterogeneous model populations. It shows that including models trained on varying underlying image datasets significantly improves performance and generalization in both in- and out-of-distribution settings.
Re-using trained neural network models is a common strategy to reduce training cost and transfer knowledge. Weight space learning, which uses the weights of trained models as a data modality, is a promising new field for re-using populations of pre-trained models for future tasks. Approaches in this field have demonstrated high performance on both model analysis and weight generation tasks. However, until now their learning setup has required homogeneous model zoos in which all models share exactly the same architecture, limiting their capability to generalize beyond the population of models seen during training. In this work, we remove this constraint and propose a modification to a common weight space learning method to accommodate training on heterogeneous populations of models. We further investigate the resulting impact of model diversity on generating unseen neural network model weights for zero-shot knowledge transfer. Our extensive experimental evaluation shows that including models with varying underlying image datasets has a high impact on performance and generalization, for both in- and out-of-distribution settings. Code is available at github.com/HSG-AIML/MultiZoo-SANE.
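
To make the idea of treating weights as a data modality concrete, the following is a minimal sketch of one way models with different architectures could be mapped to a shared input format. It assumes a chunk-based tokenization in which each layer's flattened weights are split into fixed-size, zero-padded tokens. This is an illustrative assumption, not the modification proposed in the paper; the function `tokenize_weights`, the dictionary-based model representation, and the toy values are all hypothetical.

```python
from typing import Dict, List, Tuple

Token = List[float]
Position = Tuple[int, int]  # (layer index, chunk index within that layer)


def tokenize_weights(
    layers: Dict[str, List[float]], token_size: int
) -> Tuple[List[Token], List[Position]]:
    """Split each layer's flattened weights into zero-padded chunks of equal length.

    Hypothetical helper: a model is represented here as a mapping from
    layer name to an already flattened list of weights.
    """
    if token_size <= 0:
        raise ValueError("token_size must be positive")
    tokens: List[Token] = []
    positions: List[Position] = []
    for layer_idx, flat in enumerate(layers.values()):
        for chunk_idx, start in enumerate(range(0, len(flat), token_size)):
            chunk = list(flat[start:start + token_size])
            # Pad the last chunk of a layer so every token has the same length.
            chunk += [0.0] * (token_size - len(chunk))
            tokens.append(chunk)
            positions.append((layer_idx, chunk_idx))
    return tokens, positions


# Two hypothetical models with different architectures (toy values).
small_model = {"fc1": [0.1, -0.2, 0.3], "fc2": [0.5]}
large_model = {"conv1": [0.1] * 5, "fc1": [0.2] * 4, "fc2": [0.3] * 2}

small_tokens, small_pos = tokenize_weights(small_model, token_size=2)
large_tokens, large_pos = tokenize_weights(large_model, token_size=2)
```

Under this assumption, the two models yield sequences of different lengths (3 and 6 tokens) but with identical token dimensionality, so both could in principle be pooled into a single training set for a sequence model.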