LG, ML · Apr 28, 2023

Hyperparameter Optimization through Neural Network Partitioning

arXiv:2304.14766v1 · 12 citations · h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses hyperparameter tuning for neural networks, particularly in data-limited or federated settings, where setting aside validation data or retraining repeatedly is costly.

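The authors propose a hyperparameter optimization method that partitions both the training data and the model parameters, then uses the loss of each subnetwork on data shards it has not seen (an out-of-training-sample loss) as the objective, so no separate validation set is needed. They report that the method tunes several kinds of hyperparameters in a single training run, is significantly cheaper computationally than alternative methods that optimize the marginal likelihood, and can be applied to federated learning, where retraining and cross-validation are particularly challenging.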

Well-tuned hyperparameters are crucial for obtaining good generalization behavior in neural networks. They can enforce appropriate inductive biases, regularize the model and improve performance -- especially in the presence of limited data. In this work, we propose a simple and efficient way for optimizing hyperparameters inspired by the marginal likelihood, an optimization objective that requires no validation data. Our method partitions the training data and a neural network model into $K$ data shards and parameter partitions, respectively. Each partition is associated with and optimized only on specific data shards. Combining these partitions into subnetworks allows us to define the "out-of-training-sample" loss of a subnetwork, i.e., the loss on data shards unseen by the subnetwork, as the objective for hyperparameter optimization. We demonstrate that we can apply this objective to optimize a variety of different hyperparameters in a single training run while being significantly computationally cheaper than alternative methods aiming to optimize the marginal likelihood for neural networks. Lastly, we also focus on optimizing hyperparameters in federated learning, where retraining and cross-validation are particularly challenging.
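The bookkeeping described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes one possible arrangement, which the abstract does not spell out: partitions are chained, so that subnetwork k uses the trained values of partitions 0..k and keeps all other parameters at their initial values, and its out-of-training-sample loss is evaluated on the next shard, k+1. The function names (`partition_assignment`, `make_shards`, `subnetwork_params`, `out_of_training_sample_loss`, `mse`) and the linear model are hypothetical and chosen only for illustration.

```python
import numpy as np


def partition_assignment(n_params, K):
    """Assign each parameter index to one of K contiguous partitions (assumed layout)."""
    return np.arange(n_params) * K // n_params


def make_shards(X, y, K):
    """Split the training data into K shards of (inputs, targets)."""
    return list(zip(np.array_split(X, K), np.array_split(y, K)))


def subnetwork_params(w, w_init, assignment, k):
    """Subnetwork k: partitions 0..k take trained values, the rest stay at init.

    Assumption of this sketch: partition j is only ever trained on shards 0..j,
    so subnetwork k has never seen shard k+1.
    """
    return np.where(assignment <= k, w, w_init)


def mse(w, X, y):
    """Mean squared error of a linear model, standing in for the network loss."""
    return float(np.mean((X @ w - y) ** 2))


def out_of_training_sample_loss(w, w_init, assignment, shards, loss_fn):
    """Average loss of each subnetwork k on shard k+1, which it was not trained on."""
    K = len(shards)
    losses = []
    for k in range(K - 1):
        w_sub = subnetwork_params(w, w_init, assignment, k)
        X_next, y_next = shards[k + 1]
        losses.append(loss_fn(w_sub, X_next, y_next))
    return sum(losses) / len(losses)
```

In a full training loop, the model parameters would be updated on the ordinary training loss of each subnetwork, while the hyperparameters would be updated by differentiating a quantity like `out_of_training_sample_loss`. That step needs an autodiff framework and is left out here.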

