$\bar{G}_{mst}$:An Unbiased Stratified Statistic and a Fast Gradient Optimization Algorithm Based on It
This work addresses a specific bottleneck in gradient optimization for deep learning, offering an incremental improvement over existing methods.
The paper tackles the problem of gradient expectation and variance fluctuations during parameter updates in optimization algorithms by introducing an unbiased stratified statistic, δ̅_{mst}, and a novel algorithm, MSSG, which outperforms other SGD-like algorithms in training deep models.
-The fluctuation effect of gradient expectation and variance caused by parameter update between consecutive iterations is neglected or confusing by current mainstream gradient optimization algorithms. The work in this paper remedy this issue by introducing a novel unbiased stratified statistic \ $\bar{G}_{mst}$\ , a sufficient condition of fast convergence for \ $\bar{G}_{mst}$\ also is established. A novel algorithm named MSSG designed based on \ $\bar{G}_{mst}$\ outperforms other sgd-like algorithms. Theoretical conclusions and experimental evidence strongly suggest to employ MSSG when training deep model.