LG · Jun 14, 2015

A Fast Incremental Gaussian Mixture Model

arXiv:1506.04422v2 · 74 citations
AI Analysis

This work improves the scalability of the IGMN so that it can handle high-dimensional data streams in online learning applications.

The paper tackles the scalability issue of the Incremental Gaussian Mixture Network (IGMN) by reducing its time complexity from O(NKD^3) to O(NKD^2) using precision matrices, resulting in a faster algorithm suitable for high-dimensional data, as confirmed by tests on classification datasets.

This work builds upon previous efforts in online incremental learning, namely the Incremental Gaussian Mixture Network (IGMN). The IGMN can learn from data streams in a single pass, improving its model after analyzing each data point and discarding it thereafter. However, it scales poorly: its asymptotic time complexity is $\operatorname{O}\bigl(NKD^3\bigr)$ for $N$ data points, $K$ Gaussian components and $D$ dimensions, which makes it inadequate for high-dimensional data. In this paper, we reduce this complexity to $\operatorname{O}\bigl(NKD^2\bigr)$ by deriving formulas that work directly with precision matrices instead of covariance matrices. The result is a much faster and more scalable algorithm that can be applied to high-dimensional tasks. This is confirmed by applying the modified algorithm to high-dimensional classification datasets.
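To make the complexity claim concrete, here is a minimal sketch of why keeping a precision matrix can lower the per-update cost from cubic to quadratic in $D$. It assumes the covariance change can be written as a rank-one correction $A + uv^\top$ and applies the generic Sherman-Morrison identity together with the matrix determinant lemma. It is not the paper's exact update rule, and the function name `rank_one_update` and the example values are hypothetical.

```python
def rank_one_update(a_inv, det_a, u, v):
    """Given A^{-1} and det(A), return the inverse and determinant of
    A + u v^T using only matrix-vector products, i.e. O(D^2) work.

    Illustrative helper (not from the paper): a_inv is a list of D rows,
    u and v are lists of length D.
    """
    d = len(u)
    # A^{-1} u, a column vector: O(D^2)
    a_inv_u = [sum(a_inv[i][k] * u[k] for k in range(d)) for i in range(d)]
    # v^T A^{-1}, a row vector: O(D^2)
    vt_a_inv = [sum(v[k] * a_inv[k][j] for k in range(d)) for j in range(d)]
    # Scalar 1 + v^T A^{-1} u, shared by both identities: O(D)
    denom = 1.0 + sum(v[k] * a_inv_u[k] for k in range(d))
    if abs(denom) < 1e-12:
        raise ValueError("rank-one update makes the matrix singular")
    # Sherman-Morrison: (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / denom
    new_inv = [
        [a_inv[i][j] - a_inv_u[i] * vt_a_inv[j] / denom for j in range(d)]
        for i in range(d)
    ]
    # Matrix determinant lemma: det(A + u v^T) = det(A) * denom
    new_det = det_a * denom
    return new_inv, new_det


# Example: A = diag(2, 4), so A^{-1} = diag(0.5, 0.25) and det(A) = 8.
a_inv = [[0.5, 0.0], [0.0, 0.25]]
new_inv, new_det = rank_one_update(a_inv, 8.0, [1.0, 1.0], [1.0, 1.0])
print(new_inv, new_det)
```

No matrix is ever inverted from scratch here, which is the step that costs $O(D^3)$ when covariance matrices are stored instead. Repeating a quadratic-cost update for each of $K$ components and $N$ data points is consistent with the stated $O(NKD^2)$ total.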
