ML LGDec 17, 2018

Domain Adaptation on Graphs by Learning Graph Topologies: Theoretical Analysis and an Algorithm

arXiv:1812.06944v24.26 citations

Originality Incremental advance

AI Analysis

This work addresses domain adaptation for graph-structured data, which is incremental as it extends existing domain adaptation concepts to graphs with a focus on topology learning.

The paper tackles domain adaptation on graphs by learning graph topologies to estimate unknown labels on a target graph using source graph labels and graph similarity, with theoretical bounds showing improved classification as graph topologies become more balanced and edges between distant samples are avoided. Experiments indicate the proposed method outperforms baseline approaches on synthetic and real datasets.

Traditional machine learning algorithms assume that the training and test data have the same distribution, while this assumption does not necessarily hold in real applications. Domain adaptation methods take into account the deviations in the data distribution. In this work, we study the problem of domain adaptation on graphs. We consider a source graph and a target graph constructed with samples drawn from data manifolds. We study the problem of estimating the unknown class labels on the target graph using the label information on the source graph and the similarity between the two graphs. We particularly focus on a setting where the target label function is learnt such that its spectrum is similar to that of the source label function. We first propose a theoretical analysis of domain adaptation on graphs and present performance bounds that characterize the target classification error in terms of the properties of the graphs and the data manifolds. We show that the classification performance improves as the topologies of the graphs get more balanced, i.e., as the numbers of neighbors of different graph nodes become more proportionate, and weak edges with small weights are avoided. Our results also suggest that graph edges between too distant data samples should be avoided for good generalization performance. We then propose a graph domain adaptation algorithm inspired by our theoretical findings, which estimates the label functions while learning the source and target graph topologies at the same time. The joint graph learning and label estimation problem is formulated through an objective function relying on our performance bounds, which is minimized with an alternating optimization scheme. Experiments on synthetic and real data sets suggest that the proposed method outperforms baseline approaches.

View on arXiv PDF

Similar