LG OCMar 31, 2022

Data Sampling Affects the Complexity of Online SGD over Dependent Data

Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

arXiv:2204.00006v18.77 citations

Originality Incremental advance

AI Analysis

This addresses a practical issue for machine learning practitioners dealing with non-i.i.d. data, offering incremental improvements in optimization efficiency.

The paper tackles the problem of slow convergence in online stochastic gradient descent (SGD) when data samples are highly dependent, showing that proper periodic data-subsampling improves sample complexity across all dependence levels, with mini-batch sampling further enhancing it, as validated by numerical experiments.

Conventional machine learning applications typically assume that data samples are independently and identically distributed (i.i.d.). However, practical scenarios often involve a data-generating process that produces highly dependent data samples, which are known to heavily bias the stochastic optimization process and slow down the convergence of learning. In this paper, we conduct a fundamental study on how different stochastic data sampling schemes affect the sample complexity of online stochastic gradient descent (SGD) over highly dependent data. Specifically, with a $φ$-mixing model of data dependence, we show that online SGD with proper periodic data-subsampling achieves an improved sample complexity over the standard online SGD in the full spectrum of the data dependence level. Interestingly, even subsampling a subset of data samples can accelerate the convergence of online SGD over highly dependent data. Moreover, we show that online SGD with mini-batch sampling can further substantially improve the sample complexity over online SGD with periodic data-subsampling over highly dependent data. Numerical experiments validate our theoretical results.

View on arXiv PDF

Similar