LG, DS, IR, ML · May 12, 2015

The Boundary Forest Algorithm for Online Supervised and Unsupervised Learning

arXiv:1505.02867v1 · 30 citations
Originality: Highly original
AI Analysis

This algorithm addresses the need for fast, online learning in real-time applications, offering a flexible approach for various data manifolds.

The paper introduces the Boundary Forest algorithm, a new instance-based method for online supervised and unsupervised learning that updates incrementally and achieves fast training and testing times, with empirical scaling of O(DN log N) for training and O(D log N) for testing.

We describe a new instance-based learning algorithm, the Boundary Forest (BF) algorithm, which can be used for supervised and unsupervised learning. The algorithm builds a forest of trees whose nodes store previously seen examples. It can be shown data points one at a time and updates itself incrementally, so it is naturally online. Few instance-based algorithms are both online and fast; the BF algorithm is. This is crucial for applications that must respond to input data in real time. The number of children of each node is not set beforehand but is determined by the training procedure, which makes the algorithm flexible with regard to the data manifolds it can learn. We test its generalization performance and speed on a range of benchmark datasets and detail the settings in which it outperforms the state of the art. Empirically, we find that training time scales as O(DN log N) and testing time as O(D log N), where D is the dimensionality and N the amount of data.
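The abstract gives no pseudocode, so the following is only a minimal sketch of how a single tree of this kind might behave. Several choices here are assumptions rather than details confirmed by the text above: Euclidean distance, greedy descent toward the closest stored example, and adding a new node only when the current prediction is wrong. The names `Node`, `BoundaryTree`, `query`, `train`, and `predict` are hypothetical. A forest would presumably combine several such trees, which this sketch does not attempt.

```python
import math


class Node:
    """Stores one previously seen example and its children."""

    def __init__(self, x, y):
        self.x = x          # feature vector (sequence of floats)
        self.y = y          # label
        self.children = []  # number of children is not fixed in advance


def euclidean(a, b):
    # Assumed metric; the abstract does not name one.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


class BoundaryTree:
    """Illustrative single tree; not the authors' reference implementation."""

    def __init__(self):
        self.root = None
        self.size = 0

    def query(self, x):
        # Greedy descent: move to the closest child until the current
        # node is itself closer to x than any of its children.
        node = self.root
        while True:
            best = min(node.children + [node], key=lambda n: euclidean(n.x, x))
            if best is node:
                return node
            node = best

    def train(self, x, y):
        # Online update: one example at a time.
        if self.root is None:
            self.root = Node(x, y)
            self.size = 1
            return
        nearest = self.query(x)
        # Assumption: store the example only if the tree mislabels it,
        # so stored nodes concentrate near class boundaries.
        if nearest.y != y:
            nearest.children.append(Node(x, y))
            self.size += 1

    def predict(self, x):
        return self.query(x).y


tree = BoundaryTree()
for point, label in [((0, 0), "a"), ((10, 10), "b"), ((0, 1), "a"), ((9, 9), "b")]:
    tree.train(point, label)

print(tree.size)             # number of stored examples
print(tree.predict((1, 1)))  # label of the node reached by greedy descent
```

In this sketch each query computes one D-dimensional distance per candidate node along a single root-to-leaf path. That is the kind of cost that could give roughly logarithmic query time when the tree is reasonably balanced, though the sketch does nothing to guarantee balance.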
