LGMLSep 5, 2019

McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds

arXiv:1909.02330v228 citations
Originality Incremental advance
AI Analysis

This work addresses generalization theory for machine learning with dependent data, which is common in real applications, but it appears incremental as it extends existing concentration inequalities to graph-dependent settings.

The paper tackles the problem of learning from non-i.i.d. data where dependencies are modeled by graphs, by proving new McDiarmid-type concentration inequalities based on forest complexity, which leads to stability bounds for such learning scenarios.

A crucial assumption in most statistical learning theory is that samples are independently and identically distributed (i.i.d.). However, for many real applications, the i.i.d. assumption does not hold. We consider learning problems in which examples are dependent and their dependency relation is characterized by a graph. To establish algorithm-dependent generalization theory for learning with non-i.i.d. data, we first prove novel McDiarmid-type concentration inequalities for Lipschitz functions of graph-dependent random variables. We show that concentration relies on the forest complexity of the graph, which characterizes the strength of the dependency. We demonstrate that for many types of dependent data, the forest complexity is small and thus implies good concentration. Based on our new inequalities we are able to build stability bounds for learning from graph-dependent data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes