ML · AI · LG · HEP-EX · PR · Jun 20, 2023

Principles for Initialization and Architecture Selection in Graph Neural Networks with ReLU Activations

arXiv:2306.11668v1 · 3 citations · h-index: 43
Originality: Incremental advance
AI Analysis

This work addresses optimization and performance issues in deep GNNs for researchers and practitioners, though it is incremental as it builds on existing initialization and architecture techniques.

The authors address initialization and architecture selection in graph neural networks (GNNs) with ReLU activations and derive three principles. First, they generalize He-initialization to ReLU GNNs. Second, they prove that oversmoothing is unavoidable in deep vanilla GNNs with a fixed aggregation operator, and that residual aggregation alleviates it at initialization. Third, they show that residual connections with a fixup-type initialization avoid correlation collapse. In ablation studies, these choices significantly speed up early training dynamics on a variety of tasks.

This article derives and validates three principles for initialization and architecture selection in finite-width graph neural networks (GNNs) with ReLU activations. First, we theoretically derive what is essentially the unique generalization to ReLU GNNs of the well-known He-initialization. Our initialization scheme guarantees that the average scale of network outputs and gradients remains order one at initialization. Second, we prove that in finite-width vanilla ReLU GNNs, oversmoothing is unavoidable at large depth when using a fixed aggregation operator, regardless of initialization. We then prove that using residual aggregation operators, obtained by interpolating a fixed aggregation operator with the identity, alleviates oversmoothing at initialization. Finally, we show that the common practice of using residual connections with a fixup-type initialization provably avoids correlation collapse in final-layer features at initialization. Through ablation studies we find that using the correct initialization, residual aggregation operators, and residual connections in the forward pass significantly and reliably speeds up early training dynamics in deep ReLU GNNs on a variety of tasks.
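To make the idea of a residual aggregation operator concrete, here is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions that this summary does not confirm. The fixed aggregation operator is taken to be a row-normalized adjacency matrix with self-loops. The interpolation is written as (1 - alpha) * A + alpha * I, with a hypothetical mixing weight `alpha`. The weights use the standard He variance 2 / fan_in as a stand-in, because the paper's GNN-specific generalization is not stated here. All function names are illustrative.

```python
import numpy as np


def normalized_adjacency(adj):
    """Row-normalized adjacency with self-loops (an assumed choice of fixed aggregation operator)."""
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)


def residual_aggregation(agg, alpha):
    """Interpolate a fixed aggregation operator with the identity.

    alpha = 0 returns the original operator and alpha = 1 returns the identity.
    The convex-combination form is an assumption made for this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * agg + alpha * np.eye(agg.shape[0])


def he_normal(rng, fan_in, fan_out):
    """Standard He initialization for ReLU layers: zero mean, variance 2 / fan_in."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))


def relu_gnn_forward(x, agg, depth, width, rng):
    """Forward pass of a vanilla ReLU GNN at initialization: h <- relu(agg @ h @ W)."""
    h = x
    for _ in range(depth):
        w = he_normal(rng, h.shape[1], width)
        h = np.maximum(agg @ h @ w, 0.0)
    return h


def mean_pairwise_cosine(h, eps=1e-12):
    """Average cosine similarity between distinct node feature vectors.

    Values near 1 mean node features have become nearly indistinguishable,
    which is one possible diagnostic for oversmoothing.
    """
    unit = h / (np.linalg.norm(h, axis=1, keepdims=True) + eps)
    cos = unit @ unit.T
    off_diagonal = cos[~np.eye(h.shape[0], dtype=bool)]
    return float(off_diagonal.mean())


if __name__ == "__main__":
    # Path graph on 6 nodes, used purely as a toy example.
    n = 6
    adj = np.zeros((n, n))
    for i in range(n - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0

    agg = normalized_adjacency(adj)
    for alpha in (0.0, 0.5):
        rng = np.random.default_rng(0)
        x = rng.normal(size=(n, 16))
        h = relu_gnn_forward(x, residual_aggregation(agg, alpha), depth=20, width=64, rng=rng)
        print(f"alpha={alpha}: mean pairwise cosine = {mean_pairwise_cosine(h):.4f}")
```

Because the chosen aggregation operator is row-stochastic, the interpolated operator stays row-stochastic for any `alpha` in [0, 1], so mixing in the identity does not change the overall scale of the aggregation step. The script only prints a similarity diagnostic for two settings; it does not reproduce the paper's theorems or its ablation results.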

