LG · Feb 7, 2024

Strong convexity-guided hyper-parameter optimization for flatter losses

arXiv:2402.05025v1 · h-index: 8
Originality: Incremental advance
AI Analysis

This paper addresses hyper-parameter tuning for neural networks, but it appears incremental because it builds on existing flat-minima theory.

The paper tackles hyper-parameter optimization by linking the strong convexity of the loss to its flatness, with the goal of better generalization, and reports strong performance at reduced runtime on 14 classification datasets.

We propose a novel white-box approach to hyper-parameter optimization. Motivated by recent work establishing a relationship between flat minima and generalization, we first establish a relationship between the strong convexity of the loss and its flatness. Based on this, we seek to find hyper-parameter configurations that improve flatness by minimizing the strong convexity of the loss. By using the structure of the underlying neural network, we derive closed-form equations to approximate the strong convexity parameter, and attempt to find hyper-parameters that minimize it in a randomized fashion. Through experiments on 14 classification datasets, we show that our method achieves strong performance at a fraction of the runtime.
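As a rough illustration of the idea in the abstract, the sketch below runs a randomized search that keeps the configuration with the smallest estimated strong convexity parameter. It is not the paper's method: the paper derives closed-form approximations from the network structure, whereas this toy assumes an L2-regularized logistic regression and estimates the parameter numerically as the smallest eigenvalue of the loss Hessian at the trained weights. All function names, the search ranges, and the choice of `weight_decay` and `lr` as hyper-parameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_hessian(X, w, weight_decay):
    """Hessian of the mean logistic loss plus (weight_decay / 2) * ||w||^2."""
    p = sigmoid(X @ w)
    d = p * (1.0 - p)  # per-sample curvature of the logistic loss
    return (X.T * d) @ X / X.shape[0] + weight_decay * np.eye(X.shape[1])


def train(X, y, weight_decay, lr, steps=200):
    """Plain gradient descent; a stand-in for whatever training loop is used."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / X.shape[0] + weight_decay * w
        w -= lr * grad
    return w


def strong_convexity_estimate(X, w, weight_decay):
    """Smallest Hessian eigenvalue at w (assumed proxy, not the paper's formula)."""
    return float(np.linalg.eigvalsh(logistic_hessian(X, w, weight_decay))[0])


def random_search(X, y, n_trials=20, seed=0):
    """Sample configurations at random and keep the one with the smallest estimate."""
    rng = np.random.default_rng(seed)
    best_mu, best_config = None, None
    for _ in range(n_trials):
        config = {
            "weight_decay": 10 ** rng.uniform(-4, 0),  # hypothetical range
            "lr": 10 ** rng.uniform(-2, -0.5),         # hypothetical range
        }
        w = train(X, y, **config)
        mu = strong_convexity_estimate(X, w, config["weight_decay"])
        if best_mu is None or mu < best_mu:
            best_mu, best_config = mu, config
    return best_mu, best_config


# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 5))
y_demo = (X_demo[:, 0] + 0.5 * X_demo[:, 1] > 0).astype(float)
mu_demo, config_demo = random_search(X_demo, y_demo)
print(mu_demo, config_demo)
```

In this toy setting the search mostly favors small weight decay, since the regularizer adds directly to every Hessian eigenvalue. The sketch is meant to show the shape of the selection loop only, not to reproduce the reported results.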

