LG, SI, ML | May 20, 2020

The Effects of Randomness on the Stability of Node Embeddings

Tobias Schumacher, Hinrikus Wolf, Martin Ritzert, Florian Lemmerich, Jan Bachmann, Florian Frantzen, Max Klabunde, Martin Grohe, Markus Strohmaier

arXiv:2005.10039v1 | 11.125 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses reproducibility and reliability issues for researchers and engineers who use node embeddings. Its contribution is incremental, since it focuses on evaluating existing methods.

The study systematically evaluates how randomness affects the stability of state-of-the-art node embedding algorithms. It finds significant variation in the geometry of the embedding spaces and in the classification of individual nodes, while overall node classification accuracy appears unaffected by random seeding.

We systematically evaluate the (in-)stability of state-of-the-art node embedding algorithms due to randomness, i.e., the random variation of their outcomes given identical algorithms and graphs. We apply five node embedding algorithms (HOPE, LINE, node2vec, SDNE, and GraphSAGE) to synthetic and empirical graphs and assess their stability under randomness with respect to (i) the geometry of embedding spaces and (ii) their performance in downstream tasks. We find significant instabilities in the geometry of embedding spaces, independent of the centrality of a node. In the evaluation of downstream tasks, we find that the accuracy of node classification seems to be unaffected by random seeding, while the actual classification of individual nodes can vary significantly. This suggests that instability effects need to be taken into account when working with node embeddings. Our work is relevant for researchers and engineers interested in the effectiveness, reliability, and reproducibility of node embedding approaches.
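The two kinds of stability described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names knn_jaccard and prediction_agreement, the choice of cosine-based k-nearest-neighbor overlap as the geometric measure, and the toy data are all assumptions made for illustration. The sketch shows (i) how two embeddings of the same nodes can be compared by the overlap of each node's nearest neighbors, and (ii) how two classifiers can reach the same accuracy while disagreeing on individual nodes.

```python
import numpy as np

def knn_sets(emb, k):
    """For each node, return the set of its k nearest neighbors by cosine similarity."""
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)          # a node is not its own neighbor
    idx = np.argsort(-sim, axis=1)[:, :k]   # indices of the k most similar nodes
    return [set(row.tolist()) for row in idx]

def knn_jaccard(emb_a, emb_b, k=5):
    """Mean Jaccard overlap of k-NN sets between two embeddings of the same nodes."""
    sets_a, sets_b = knn_sets(emb_a, k), knn_sets(emb_b, k)
    return float(np.mean([len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)]))

def prediction_agreement(pred_a, pred_b):
    """Fraction of nodes that receive the same label in both runs."""
    return float(np.mean(np.asarray(pred_a) == np.asarray(pred_b)))

# Toy data (hypothetical): 50 nodes embedded in 8 dimensions.
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 8))

# A rotated copy has different coordinates but the same geometry,
# so the neighborhoods are preserved.
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
rotated = emb @ q

# An unrelated embedding stands in for a run whose geometry changed.
other = rng.normal(size=(50, 8))

stable_overlap = knn_jaccard(emb, rotated)
unstable_overlap = knn_jaccard(emb, other)

# Same accuracy, different per-node predictions.
labels = np.array([0, 1, 0, 1])
pred_run_1 = np.array([0, 1, 0, 0])
pred_run_2 = np.array([0, 0, 0, 1])
acc_1 = float(np.mean(pred_run_1 == labels))
acc_2 = float(np.mean(pred_run_2 == labels))
agreement = prediction_agreement(pred_run_1, pred_run_2)
```

In the toy example both runs have the same accuracy, yet they agree on only half of the nodes, which is the kind of effect that aggregate accuracy alone would hide. In practice, emb_a and emb_b would come from two runs of the same embedding algorithm on the same graph with different random seeds.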

