LG · ML · Oct 16, 2012

Variational Dual-Tree Framework for Large-Scale Transition Matrix Approximation

arXiv:1210.4846v1 · 6 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of scaling graph-based machine learning methods to large datasets, which matters to both researchers and practitioners. The advance is incremental, since it builds on existing approximation techniques.

The paper tackles the scalability issue of non-parametric random walk methods on graphs by proposing a dual-tree variational framework for approximating transition matrices, achieving order-of-magnitude speedups without accuracy loss in Label Propagation tasks on benchmark datasets.

In recent years, non-parametric methods utilizing random walks on graphs have been used to solve a wide range of machine learning problems, but in their simplest form they do not scale well due to their quadratic complexity. In this paper, a new dual-tree based variational approach for approximating the transition matrix and efficiently performing the random walk is proposed. The approach exploits a connection between kernel density estimation, mixture modeling, and random walks on graphs in an optimization of the transition matrix for the data graph that ties together edge transition probabilities that are similar. Compared to the de facto standard approximation method based on k-nearest neighbors, we demonstrate orders-of-magnitude speedups without sacrificing accuracy for Label Propagation tasks on benchmark data sets in semi-supervised learning.
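
For orientation, the quadratic-cost baseline that the abstract refers to can be sketched as follows. This is a minimal illustration and not the paper's dual-tree variational method: it assumes a Gaussian kernel with a hypothetical bandwidth `sigma`, and the function names `transition_matrix` and `label_propagation` are illustrative choices that do not come from the paper.

```python
import numpy as np

def transition_matrix(X, sigma):
    """Exact row-stochastic transition matrix from a Gaussian kernel.

    Assumption: Gaussian similarity with bandwidth `sigma`. Building all
    N x N entries is the quadratic cost that approximations try to avoid.
    """
    # Pairwise squared Euclidean distances between all points.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-transitions
    # Normalize each row so it sums to 1 (a random-walk transition matrix).
    return W / W.sum(axis=1, keepdims=True)

def label_propagation(P, Y0, labeled, n_iter=200):
    """Repeat Y <- P Y, resetting the labeled rows to their known labels."""
    Y = Y0.copy()
    for _ in range(n_iter):
        Y = P @ Y
        Y[labeled] = Y0[labeled]
    return Y

# Toy usage: two well-separated clusters with one labeled point each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
labeled = np.array([0, 20])
Y0 = np.zeros((40, 2))
Y0[0, 0] = 1.0   # point 0 belongs to class 0
Y0[20, 1] = 1.0  # point 20 belongs to class 1
P = transition_matrix(X, sigma=1.0)
pred = label_propagation(P, Y0, labeled).argmax(axis=1)
```

Each propagation step here is a dense matrix-vector product over all pairs of points, which is the cost that both the k-nearest-neighbor sparsification and the proposed approximation are meant to reduce.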

