LG · Nov 2, 2024

Optimization of GNN Training Through Half-precision

Arnab Kanti Tarafder, Yidong Gong, Pradeep Kumar

arXiv:2411.01109v26.44 citations · h-index: 2 · HPDC

Originality: Incremental advance

AI Analysis

This work addresses performance and memory bottlenecks for GNN training, offering a domain-specific optimization that is incremental but provides concrete gains.

The paper tackled the problem of half-precision training underperforming in GNN systems due to issues like value overflow and poor hardware utilization, and introduced HalfGNN, which achieved an average 2.30x speedup in training time and 2.67x memory savings while maintaining similar accuracy.

Recent trends in lower-precision training, e.g. with half-precision floating point, have shown improved system performance and reduced memory usage for Deep Learning while maintaining accuracy. However, current GNN systems cannot achieve these goals, as our analyses show that they massively underperform and show abnormal accuracy when using half-precision. These systems suffer from value overflow due to the lowered precision, under-utilization of hardware resources, and poor training performance. To mitigate this, we introduce HalfGNN, a half-precision based GNN system. HalfGNN proposes novel techniques: new vector operations for half-precision data types that improve data load and reduction performance, and a discretized SpMM that overcomes the value overflow and natively provides workload balancing. These techniques improve hardware utilization, reduce memory usage, and remove atomic writes. Evaluations show that HalfGNN achieves an average 2.30X speedup in training time over DGL (float-based) across GAT, GCN, and GIN while achieving similar accuracy and saving 2.67X memory.
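
The value overflow mentioned in the abstract can be illustrated with a small NumPy sketch. It is not HalfGNN code: the function names, the 1000-neighbor node, and the feature value 100.0 are all hypothetical, and the "scale first" variant is only a generic way to keep a half-precision reduction in range, not the paper's discretized SpMM. It shows that a mean aggregation computed as sum-then-divide can overflow in float16, whose largest finite value is 65504, even though the true mean is small.

```python
import numpy as np

FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0, largest finite float16


def fp16_sum(values):
    """Sum sequentially, rounding the accumulator to float16 at every step."""
    acc = np.float16(0.0)
    with np.errstate(over="ignore"):  # silence NumPy's overflow warning
        for v in values:
            acc = np.float16(acc + v)
    return acc


def mean_after_sum_fp16(neighbor_feats):
    """Sum all neighbor features first, then divide by the degree."""
    total = fp16_sum(neighbor_feats)
    return np.float16(total / np.float16(len(neighbor_feats)))


def mean_scale_first_fp16(neighbor_feats):
    """Divide each feature by the degree first, then sum (assumed mitigation)."""
    scaled = neighbor_feats / np.float16(len(neighbor_feats))
    return fp16_sum(scaled)


# Hypothetical node with 1000 neighbors, each contributing the value 100.0.
# The exact sum is 100000, which exceeds FP16_MAX; the exact mean is 100.
feats = np.full(1000, 100.0, dtype=np.float16)

print("float16 max:", FP16_MAX)
print("sum then divide:", mean_after_sum_fp16(feats))    # overflows to inf
print("scale then sum:", mean_scale_first_fp16(feats))   # stays finite
```

The second variant stays finite but is not exact, because every addition is rounded to float16. This is one reason half-precision reductions need care beyond a simple dtype switch.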
