LGDSSIMLNov 12, 2017

On the ERM Principle with Networked Data

arXiv:1711.04297v21 citations
Originality Incremental advance
AI Analysis

This work addresses a theoretical gap in machine learning for networked data, such as in ranking and link prediction, by improving risk bounds over classical ERM, though it appears incremental as it builds on existing ERM principles.

The paper tackles the challenge of learning from networked data where target values are missing for some pairs, by proposing a general weighted ERM approach with new universal risk bounds and an FPTAS to solve the resulting non-convex optimization problem.

Networked data, in which every training example involves two objects and may share some common objects with others, is used in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d.\ assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results of this problem only deal with the classical empirical risk minimization (ERM) principle that always weights every example equally, but this strategy leads to unsatisfactory bounds. We consider general weighted ERM and show new universal risk bounds for this problem. These new bounds naturally define an optimization problem which leads to appropriate weights for networked examples. Though this optimization problem is not convex in general, we devise a new fully polynomial-time approximation scheme (FPTAS) to solve it.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes