DC LG SP | May 22, 2023

Distributed Learning over Networks with Graph-Attention-Based Personalization

Zhuojun Tian, Zhaoyang Zhang, Zhaohui Yang, Richeng Jin, Huaiyu Dai

arXiv:2305.13041v15.913 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses data heterogeneity in distributed learning over networks. It offers machine learning practitioners a personalized training approach that is an incremental advance over existing methods.

The paper tackles the inefficiency of conventional distributed learning under non-i.i.d. data by proposing GATTA, a graph-attention-based personalized training algorithm. Each agent trains a local model while exploiting its correlation with neighboring agents. The authors report improved convergence and reduced communication cost in numerical experiments.

In conventional distributed learning over a network, multiple agents collaboratively build a common machine learning model. However, because the data are non-i.i.d. across agents, a single unified model is inefficient for each agent when processing its locally accessible data. To address this problem, we propose a graph-attention-based personalized training algorithm (GATTA) for distributed deep learning. GATTA enables each agent to train its local personalized model while exploiting its correlation with neighboring nodes and using their useful information for aggregation. In particular, the personalized model in each agent is composed of a global part and a node-specific part. By treating each agent as one node in a graph and the node-specific parameters as its features, the benefits of the graph attention mechanism can be inherited. Namely, instead of aggregating by averaging, GATTA learns specific weights for different neighboring nodes without requiring prior knowledge of the graph structure or the neighboring nodes' data distributions. Furthermore, relying on the weight-learning procedure, we develop a communication-efficient GATTA that skips the transmission of information with small aggregation weights. Additionally, we theoretically analyze the convergence properties of GATTA for non-convex loss functions. Numerical results validate the performance of the proposed algorithms in terms of convergence and communication cost.
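The aggregation idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: GATTA learns its attention weights during training, whereas the sketch below substitutes a fixed scaled dot-product score followed by a softmax. The function names, the `threshold` and `mix` parameters, and the renormalization after skipping are all assumptions made for illustration only.

```python
import numpy as np

def attention_weights(own, neighbors):
    """Softmax over scaled dot-product scores (a stand-in for learned attention).

    own:       (d,) node-specific parameters of this agent
    neighbors: (k, d) node-specific parameters received from k neighbors
    """
    scores = neighbors @ own / np.sqrt(own.size)
    scores = scores - scores.max()  # subtract max for numerical stability
    w = np.exp(scores)
    return w / w.sum()

def aggregate(own, neighbors, threshold=0.0, mix=0.5):
    """Blend own parameters with an attention-weighted neighbor average.

    Neighbors whose weight falls below `threshold` are skipped, mimicking
    the communication-saving idea; `mix` (hypothetical) sets how much of
    the neighbor average is blended in.
    """
    w = attention_weights(own, neighbors)
    keep = w >= threshold
    if not keep.any():
        # No neighbor is informative enough: keep the local parameters.
        return own.copy(), keep
    w_kept = w[keep] / w[keep].sum()  # renormalize over retained neighbors
    neighbor_avg = w_kept @ neighbors[keep]
    return (1.0 - mix) * own + mix * neighbor_avg, keep

if __name__ == "__main__":
    own = np.array([1.0, 0.0])
    neighbors = np.array([[1.0, 0.0],    # similar neighbor
                          [-1.0, 0.0]])  # dissimilar neighbor
    updated, keep = aggregate(own, neighbors, threshold=0.3)
    print(updated, keep)
```

In this toy setup the dissimilar neighbor receives a weight below the threshold and is dropped, so only the similar neighbor contributes to the update. In an actual deployment the skip decision would determine which neighbors need to transmit at all, which is where the communication saving would come from.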
