MLNov 21, 2017

On the EM-Tau algorithm: a new EM-style algorithm with partial E-steps

arXiv:1711.07814v11.85 citations

Originality Incremental advance

AI Analysis

This incremental improvement addresses computational bottlenecks for statisticians and data scientists working with large datasets in tasks like clustering.

The paper tackles the EM algorithm's inefficiency with large datasets by introducing the EM-Tau algorithm, which uses partial E-steps to approximate traditional EM with high accuracy and significantly reduced running time.

The EM algorithm is one of many important tools in the field of statistics. While often used for imputing missing data, its widespread applications include other common statistical tasks, such as clustering. In clustering, the EM algorithm assumes a parametric distribution for the clusters, whose parameters are estimated through a novel iterative procedure that is based on the theory of maximum likelihood. However, one major drawback of the EM algorithm, that renders it impractical especially when working with large datasets, is that it often requires several passes of the data before convergence. In this paper, we introduce a new EM-style algorithm that implements a novel policy for performing partial E-steps. We call the new algorithm the EM-Tau algorithm, which can approximate the traditional EM algorithm with high accuracy but with only a fraction of the running time.

View on arXiv PDF

Similar