AI LG ML

May 11, 2014

Learning from networked examples

arXiv:1405.2600v4

16 citations

Originality: Incremental advance

AI Analysis

The paper addresses a fundamental issue in machine learning on dependent data, such as data from social networks or biological systems, and offers an approach that stays statistically sound when training examples are not independent.

The paper tackles the problem of learning from networked examples where training examples are not independent due to shared objects, showing that ignoring this can harm statistical accuracy. It provides improved sample error bounds and novel concentration inequalities through efficient sample weighting schemes.

Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption no longer holds when learning from a networked sample, because two or more training examples may share some common objects and hence share the features of these shared objects. We show that the classic approach of ignoring this problem can harm the accuracy of statistics, and then consider alternatives. One of these is to use only independent examples, discarding other information. However, this is clearly suboptimal. We analyze sample error bounds in this networked setting, providing significantly improved results. An important component of our approach is formed by efficient sample weighting schemes, which lead to novel concentration inequalities.
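To make the contrast between discarding dependent examples and weighting them concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's scheme: the abstract does not specify the weights, so the rule used here (each example gets weight 1 divided by the largest number of examples sharing any of its objects) is a hypothetical choice that merely guarantees the weights on any single object sum to at most 1. The function names and the toy data are likewise made up for this example.

```python
from collections import Counter


def greedy_independent_subset(examples):
    """Return indices of examples that share no object with an earlier chosen one."""
    used, chosen = set(), []
    for i, objs in enumerate(examples):
        if used.isdisjoint(objs):
            chosen.append(i)
            used.update(objs)
    return chosen


def degree_based_weights(examples):
    """Hypothetical weighting: 1 / (largest number of examples sharing one of its objects).

    Assumes every example contains at least one object. For any object, the
    weights of the examples containing it then sum to at most 1.
    """
    degree = Counter(obj for objs in examples for obj in objs)
    return [1.0 / max(degree[o] for o in objs) for objs in examples]


def weighted_mean(values, weights):
    """Weighted average of per-example values."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Toy networked sample: three examples, each pair sharing one object.
examples = [{"a", "b"}, {"b", "c"}, {"c", "a"}]
values = [1.0, 2.0, 6.0]

independent = greedy_independent_subset(examples)  # keeps only the first example
weights = degree_based_weights(examples)           # every example keeps weight 0.5
estimate = weighted_mean(values, weights)
```

In this toy sample, the independent-subset route keeps a single example, while the weighted route uses all three with a total weight of 1.5, which is the kind of gain in effective sample size the abstract alludes to.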
