LG · Oct 28, 2021

On Provable Benefits of Depth in Training Graph Convolutional Networks

Weilin Cong, Morteza Ramezani, Mehrdad Mahdavi

arXiv:2110.15174v123.692 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a key problem in graph neural networks, the performance degradation of deep GCNs, and gives researchers and practitioners both a theoretical explanation and a practical remedy. The advance is incremental because it builds on existing GCN frameworks.

The paper tackles the performance degradation of deep Graph Convolutional Networks (GCNs). It shows that over-smoothing is not inevitable: deep models can reach high training accuracy, but they generalize poorly. It proposes a decoupled structure that preserves expressive power and improves generalization, and validates the theory on synthetic and real-world datasets.

Graph Convolutional Networks (GCNs) are known to suffer from performance degradation as the number of layers increases, which is usually attributed to over-smoothing. Despite the apparent consensus, we observe a discrepancy between the theoretical understanding of over-smoothing and the practical capabilities of GCNs. Specifically, we argue that over-smoothing does not necessarily happen in practice: a deeper model is provably expressive, can converge to a global optimum at a linear convergence rate, and can achieve very high training accuracy as long as it is properly trained. Although deeper models can achieve high training accuracy, empirical results show that they generalize poorly at the testing stage, and a theoretical understanding of this behavior remains elusive. To achieve a better understanding, we carefully analyze the generalization capability of GCNs and show that the training strategies used to achieve high training accuracy significantly deteriorate the generalization capability of GCNs. Motivated by these findings, we propose a decoupled structure for GCNs that detaches weight matrices from feature propagation to preserve the expressive power and ensure good generalization performance. We conduct empirical evaluations on various synthetic and real-world datasets to validate the correctness of our theory.
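
To make the idea of detaching weight matrices from feature propagation concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names (`normalized_adjacency`, `coupled_gcn`, `decoupled_gcn`), the uniform averaging of hops, and the single shared weight matrix are all illustrative assumptions; the abstract does not specify how the propagated features are combined.

```python
import math


def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph on n nodes."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # add self-loops
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def coupled_gcn(p, x, weights):
    """Standard stack: each layer propagates, applies its own weights, then ReLU."""
    h = x
    for w in weights:
        h = [[max(0.0, v) for v in row] for row in matmul(matmul(p, h), w)]
    return h


def decoupled_gcn(p, x, w, num_hops):
    """Assumed decoupled form: propagate without weights, average hops, apply one W."""
    hops = [x]
    for _ in range(num_hops):
        hops.append(matmul(p, hops[-1]))  # weight-free feature propagation
    n, d = len(x), len(x[0])
    avg = [[sum(h[i][j] for h in hops) / len(hops) for j in range(d)]
           for i in range(n)]
    return matmul(avg, w)  # the only trainable matrix in this sketch


# Usage on a 4-node path graph with 2-dimensional features (toy values).
p = normalized_adjacency([(0, 1), (1, 2), (2, 3)], 4)
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
deep_out = decoupled_gcn(p, x, w, num_hops=8)
```

In the coupled stack the number of weight matrices grows with depth, whereas in this sketch the propagation depth (`num_hops`) can increase without adding any parameters. That is one way to read "detaches weight matrices from feature propagation".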
