LG · IT · ML · Feb 13, 2019

Differential Description Length for Hyperparameter Selection in Machine Learning

arXiv:1902.04699v2 · 2 citations
Originality: Incremental advance
AI Analysis

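The paper addresses model selection for machine learning practitioners, offering an alternative to cross-validation and to traditional MDL and Bayes methods that reports smaller generalization error in its experiments. The advance appears incremental because it builds on minimum description length (MDL).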

The paper tackles hyperparameter selection in machine learning by introducing differential description length (DDL), a method that predicts generalization error from training data alone, and shows it leads to smaller generalization error than cross-validation and traditional methods in experiments.

This paper introduces a new method for model selection and, more generally, hyperparameter selection in machine learning. Minimum description length (MDL) is an established method for model selection, but it is not directly aimed at minimizing generalization error, which is often the primary goal in machine learning. The paper demonstrates a relationship between generalization error and a difference of description lengths of the training data; we call this difference differential description length (DDL). This allows generalization error to be predicted from the training data alone, by encoding the training data. DDL can then be used for model selection by choosing the model with the smallest predicted generalization error. We show how this method can be used for linear regression and for neural networks, including deep learning. Experimental results show that DDL leads to smaller generalization error than cross-validation and traditional MDL and Bayes methods.
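The idea of scoring a model by a difference of description lengths can be illustrated with a minimal sketch. The abstract does not specify the paper's encoder, so the sketch below makes its own assumptions: it uses prequential (predictive plug-in) coding with a Gaussian noise model for polynomial regression, and the names `prequential_length`, `differential_length`, `poly_features`, `m`, and `warmup` are hypothetical. It should be read as one possible instance of "description length of the full training set minus description length of a prefix", not as the authors' construction.

```python
import numpy as np

def gaussian_codelength(residual, variance):
    """Code length in nats of one residual under N(0, variance)."""
    return 0.5 * np.log(2.0 * np.pi * variance) + residual**2 / (2.0 * variance)

def prequential_length(X, y, start, stop, ridge=1e-8):
    """Code length of y[start:stop]; each point is coded with a least-squares
    model fit only on the points that precede it (assumed coding scheme)."""
    k = X.shape[1]
    total = 0.0
    for i in range(start, stop):
        Xi, yi = X[:i], y[:i]
        # Tiny ridge term keeps the normal equations solvable.
        w = np.linalg.solve(Xi.T @ Xi + ridge * np.eye(k), Xi.T @ yi)
        dof = max(i - k, 1)
        var = max(np.sum((yi - Xi @ w) ** 2) / dof, 1e-12)
        total += gaussian_codelength(y[i] - X[i] @ w, var)
    return total

def differential_length(X, y, m, warmup):
    """(L(y[:n]) - L(y[:m])) / (n - m), in nats per sample.
    Both lengths are measured from a shared warm-up prefix, so the
    difference is the cost of coding y[m:n] given the first m points."""
    n = len(y)
    full = prequential_length(X, y, warmup, n)
    part = prequential_length(X, y, warmup, m)
    return (full - part) / (n - m)

def poly_features(x, degree):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

# Synthetic data (an assumption for illustration only): a cubic plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 0.5 * x**3 + 0.1 * rng.standard_normal(200)

# Treat the polynomial degree as the hyperparameter and pick the lowest score.
scores = {d: differential_length(poly_features(x, d), y, m=100, warmup=20)
          for d in range(8)}
best_degree = min(scores, key=scores.get)
print(best_degree, scores[best_degree])
```

Under this coding scheme the score is the average negative log-likelihood of the later training points when each is predicted from earlier ones, which is why it can act as a proxy for generalization error without a held-out set. The split point `m` and the warm-up length are free choices in this sketch.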
