LG · ML · Dec 9, 2017

Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent

arXiv:1712.03428v1 · 1 citation
AI Analysis

This work addresses the trade-off between gradient accuracy and computational cost in stochastic optimization, a practical concern for machine learning practitioners, but it appears incremental because it builds on existing methods.

The paper tackles the problem of automatically determining batch size in stochastic gradient descent by optimizing a cost-sensitive ratio of expected improvement to sample count, and empirically compares it with related methods on classification tasks.

In this paper, we propose a novel approach to automatically determine the batch size in stochastic gradient descent methods. The choice of the batch size induces a trade-off between the accuracy of the gradient estimate and the cost of each update in terms of samples. We propose to determine the batch size by optimizing the ratio between a lower bound on a linear or quadratic Taylor approximation of the expected improvement and the number of samples used to estimate the gradient. The performance of the proposed approach is empirically compared with related methods on popular classification tasks. The work was presented at the NIPS workshop on Optimizing the Optimizers, Barcelona, Spain, 2016.
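The abstract does not give the exact form of the lower bound, so the following is only a minimal sketch of the general idea (pick the batch size that maximizes a lower bound on improvement per sample), not the authors' method. It assumes a hypothetical bound on the linearized improvement of the form step_size * (||g||^2 - c * sigma * ||g|| / sqrt(n)), where `grad_norm`, `sigma`, `c`, and `step_size` are illustrative names for the gradient norm, a per-sample gradient noise scale, a confidence multiplier, and the learning rate.

```python
import math


def lower_bound_improvement(n, grad_norm, sigma, step_size=0.1, c=2.0):
    """Hypothetical lower bound on the linearized expected improvement.

    Assumption (not from the paper): the gradient estimation error shrinks
    like c * sigma / sqrt(n), so the guaranteed improvement of one step is
    step_size * (||g||^2 - c * sigma * ||g|| / sqrt(n)).
    """
    return step_size * (grad_norm ** 2 - c * sigma * grad_norm / math.sqrt(n))


def select_batch_size(grad_norm, sigma, n_max, step_size=0.1, c=2.0):
    """Return the batch size in [1, n_max] that maximizes bound / n."""
    best_n, best_ratio = 1, -math.inf
    for n in range(1, n_max + 1):
        # Cost-sensitive objective: guaranteed improvement per sample used.
        ratio = lower_bound_improvement(n, grad_norm, sigma, step_size, c) / n
        if ratio > best_ratio:
            best_n, best_ratio = n, ratio
    return best_n


if __name__ == "__main__":
    # Noisier gradients (larger sigma) push the selected batch size up.
    print(select_batch_size(grad_norm=1.0, sigma=2.0, n_max=1000))
    print(select_batch_size(grad_norm=1.0, sigma=4.0, n_max=1000))
```

Under this assumed bound, the continuous maximizer is n = (1.5 * c * sigma / ||g||)^2, so the selected batch size grows as the gradient becomes small relative to its noise, which matches the trade-off the abstract describes. In practice `grad_norm` and `sigma` would themselves have to be estimated from samples.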
