LG ME Apr 15, 2012

Plug-in martingales for testing exchangeability on-line

arXiv:1204.3251v2, 77 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for a valid, real-time measure of how far the exchangeability assumption has been falsified, which matters in online learning, where the assumption must keep holding as examples arrive. The advance is incremental: it extends known techniques for constructing exchangeability martingales.

The paper tackles testing exchangeability in an online setting, where examples arrive one by one. It introduces a new method for constructing exchangeability martingales that is competitive with earlier ones. On the USPS dataset the known techniques already give satisfactory results; on the Statlog Satellite dataset the new, more flexible method becomes necessary.

A standard assumption in machine learning is the exchangeability of data, which is equivalent to assuming that the examples are generated from the same probability distribution independently. This paper is devoted to testing the assumption of exchangeability on-line: the examples arrive one by one, and after receiving each example we would like to have a valid measure of the degree to which the assumption of exchangeability has been falsified. Such measures are provided by exchangeability martingales. We extend known techniques for constructing exchangeability martingales and show that our new method is competitive with the martingales introduced before. Finally we investigate the performance of our testing method on two benchmark datasets, USPS and Statlog Satellite data; for the former, the known techniques give satisfactory results, but for the latter our new more flexible method becomes necessary.
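
To make the idea of an on-line exchangeability martingale concrete, here is a minimal sketch under stated assumptions. The abstract does not give the construction, so everything below is illustrative: the nonconformity score (distance from the running mean), the histogram density estimate used as the betting function, and the names `conformal_p_values`, `histogram_density` and `plugin_log_martingale` are all hypothetical choices, not the authors' method. The sketch relies on one general fact: under exchangeability the smoothed p-values are independent and uniform on [0, 1], so multiplying betting functions that integrate to one and depend only on earlier p-values yields a valid martingale.

```python
import math
import random


def conformal_p_values(data, rng):
    """Smoothed on-line p-values for a sequence of real numbers.

    Assumed nonconformity score: distance from the mean of the examples
    seen so far (including the new one). Ties are broken at random.
    """
    p_values = []
    total = 0.0
    for n, z in enumerate(data, start=1):
        total += z
        mean = total / n
        scores = [abs(x - mean) for x in data[:n]]
        last = scores[-1]
        greater = sum(1 for s in scores if s > last)
        equal = sum(1 for s in scores if s == last)  # counts the new example too
        p_values.append((greater + rng.random() * equal) / n)
    return p_values


def histogram_density(counts, seen, p):
    """Betting function: add-one-smoothed histogram density at p.

    It integrates to one over [0, 1] for any counts that sum to `seen`.
    """
    bins = len(counts)
    b = min(int(p * bins), bins - 1)
    return bins * (counts[b] + 1) / (seen + bins)


def plugin_log_martingale(p_values, bins=10):
    """Return log10 of the martingale after each example.

    The density is estimated from the earlier p-values only and is
    updated after betting on the current one.
    """
    counts = [0] * bins
    log_m = 0.0
    path = []
    for seen, p in enumerate(p_values):
        log_m += math.log10(histogram_density(counts, seen, p))
        path.append(log_m)
        counts[min(int(p * bins), bins - 1)] += 1
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data stream: the distribution shifts halfway through.
    stream = [rng.gauss(0, 1) for _ in range(200)] + [rng.gauss(5, 1) for _ in range(200)]
    path = plugin_log_martingale(conformal_p_values(stream, rng))
    print("final log10 martingale value:", path[-1])
```

A large final value would count as evidence against exchangeability, since a nonnegative martingale starting at one rarely grows large when the assumption holds. The histogram here is only a stand-in for whatever density estimator one prefers; any estimator that is normalised on [0, 1] and uses only past p-values keeps the construction valid.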

