Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
This provides the first theoretical support for the optimization success of GNNs with skip connections, suggesting practical promise for deep GNNs, though it is incremental as it builds on existing GNN frameworks.
The paper tackles the optimization properties of Graph Neural Networks (GNNs) by analyzing gradient dynamics, proving linear convergence to a global minimum under mild assumptions and showing that skip connections, more depth, and good label distribution implicitly accelerate training, with empirical results aligning theory with nonlinear GNNs.
Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.