Properties of the Stochastic Approximation EM Algorithm with Mini-batch Sampling
This work addresses scalability issues in statistical estimation for large datasets, though the contribution is incremental: it adapts an existing algorithm by adding mini-batch sampling.
The authors tackle the challenge of handling very large datasets by proposing a mini-batch version of the Stochastic Approximation EM algorithm for latent variable models. They show that it substantially speeds up convergence and give insights into the effect of the mini-batch size.
To deal with very large datasets, a mini-batch version of the Markov chain Monte Carlo Stochastic Approximation Expectation-Maximization (MCMC-SAEM) algorithm for general latent variable models is proposed. For exponential models, the algorithm is shown to be convergent under classical conditions as the number of iterations increases. Numerical experiments illustrate the performance of the mini-batch algorithm in various models. In particular, we highlight that mini-batch sampling results in an important speed-up of the convergence of the sequence of estimators generated by the algorithm. Moreover, insights on the effect of the mini-batch size on the limit distribution are presented. Finally, we illustrate how to use mini-batch sampling in practice to improve results when a constraint on the computing time is given.
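As a rough illustration of the idea, the following is a minimal sketch of a mini-batch stochastic approximation EM loop on a toy model. The model is an assumption made only for this example and is not taken from the paper: observations y_i = z_i + e_i with latent z_i ~ N(theta, 1) and noise e_i ~ N(0, 1), so that the maximum likelihood estimate of theta is the sample mean of y. The function name `minibatch_saem`, the parameters `alpha` (mini-batch proportion) and `burn_in`, and the step-size schedule are likewise hypothetical choices. At each iteration only a random mini-batch of latent variables is refreshed (here by an exact draw from the conditional distribution, which is a special case of a Markov kernel), the complete-data sufficient statistic is updated by stochastic approximation, and the parameter is re-estimated.

```python
import random

def minibatch_saem(y, alpha=0.1, n_iter=500, burn_in=100, seed=0):
    """Toy mini-batch SAEM for y_i = z_i + e_i, z_i ~ N(theta, 1), e_i ~ N(0, 1).

    All names and defaults are illustrative assumptions, not the paper's notation.
    """
    rng = random.Random(seed)
    n = len(y)
    batch_size = max(1, int(alpha * n))
    z = [0.0] * n      # current values of the latent variables
    z_sum = 0.0        # running sum of z, kept up to date incrementally
    s = 0.0            # stochastic approximation of the sufficient statistic mean(z)
    theta = 0.0        # current parameter estimate

    for k in range(1, n_iter + 1):
        # Simulation step: refresh only a random mini-batch of latent variables.
        # For this toy model, z_i | y_i, theta ~ N((y_i + theta) / 2, 1/2).
        for i in rng.sample(range(n), batch_size):
            new_z = rng.gauss(0.5 * (y[i] + theta), 0.5 ** 0.5)
            z_sum += new_z - z[i]
            z[i] = new_z

        # Stochastic approximation step: constant step size during burn-in,
        # then a decreasing sequence with sum(gamma) = inf and sum(gamma^2) < inf.
        gamma = 1.0 if k <= burn_in else (k - burn_in) ** -0.6
        s += gamma * (z_sum / n - s)

        # Maximization step: in this toy model the complete-data MLE is theta = s.
        theta = s

    return theta


if __name__ == "__main__":
    data_rng = random.Random(1)
    y = [3.0 + data_rng.gauss(0.0, 2 ** 0.5) for _ in range(1000)]
    print(minibatch_saem(y), sum(y) / len(y))
```

With `alpha=1.0` every latent variable is refreshed at each iteration, which corresponds to the full-batch algorithm. Smaller values of `alpha` make each iteration cheaper at the cost of slower mixing of the latent variables per iteration, which is the trade-off the abstract refers to.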