AR · LG · Dec 24, 2024

GCN-ABFT: Low-Cost Online Error Checking for Graph Convolutional Networks

arXiv:2412.18534v1 · 3 citations · h-index: 19 · IEEE Trans Comput Des Integr Circuit Syst
Originality: Incremental advance
AI Analysis

This work addresses a key architectural challenge for GCN hardware accelerators. It offers a low-cost error-checking solution that is an incremental improvement over existing Algorithm-based Fault Tolerance (ABFT) techniques.

The paper tackles the problem of detecting errors caused by hardware faults in Graph Convolutional Network (GCN) accelerators. It introduces GCN-ABFT, a method that reduces the number of checksum computation operations by over 21% on average without compromising fault-detection accuracy.

Graph convolutional networks (GCNs) are popular for building machine-learning applications for graph-structured data. This widespread adoption has led to the development of specialized GCN hardware accelerators. In this work, we address a key architectural challenge for GCN accelerators: how to detect errors in GCN computations arising from random hardware faults with the least computation cost. Each GCN layer performs a graph convolution, mathematically equivalent to multiplying three matrices, computed through two separate matrix multiplications. Existing Algorithm-based Fault Tolerance (ABFT) techniques can check the results of individual matrix multiplications. However, for a GCN layer, this check must be performed twice. To avoid this overhead, this work introduces GCN-ABFT, which directly calculates a checksum for the entire three-matrix product within a single GCN layer, providing a cost-effective approach for error detection in GCN accelerators. Experimental results demonstrate that GCN-ABFT reduces the number of operations needed for checksum computation by over 21% on average for representative GCN applications. These savings are achieved without sacrificing fault-detection accuracy, as evidenced by the presented fault-injection analysis.
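The idea of checking a three-matrix product with one checksum can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the layer output is the product A·H·W (adjacency, features, weights) and uses the identity eᵀ(AHW)e = (eᵀA)·H·(We), where e is an all-ones vector. The function names (`matmul`, `fused_checksum`, `check_layer`) and the tiny integer matrices are hypothetical, chosen only for illustration.

```python
def matmul(X, Y):
    """Plain dense matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def total(M):
    """Sum of all entries of a matrix (the 'actual' checksum)."""
    return sum(sum(row) for row in M)


def fused_checksum(A, H, W):
    """Predict the sum of all entries of A @ H @ W without forming the product.

    Assumption for this sketch: e^T (A H W) e = (e^T A) H (W e),
    so one pass over A, one over W, and one matrix-vector product suffice.
    """
    col_sums_A = [sum(col) for col in zip(*A)]   # e^T A, length N
    row_sums_W = [sum(row) for row in W]         # W e, length F
    # H (W e): one matrix-vector product, length N
    Hw = [sum(h * w for h, w in zip(row, row_sums_W)) for row in H]
    return sum(a * v for a, v in zip(col_sums_A, Hw))


def check_layer(A, H, W, out, tol=0):
    """Return True if the layer output matches the predicted checksum."""
    return abs(total(out) - fused_checksum(A, H, W)) <= tol


# Hypothetical 3-node graph, 2 input features, 2 output features.
A = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
H = [[1, 2], [3, 4], [5, 6]]
W = [[1, 0], [2, 1]]

out = matmul(matmul(A, H), W)      # two separate matrix multiplications
print(check_layer(A, H, W, out))   # fault-free output passes the check

out[1][0] += 1                     # inject a single-element fault
print(check_layer(A, H, W, out))   # corrupted output fails the check
```

Integers are used so the comparison is exact. With floating-point data a nonzero `tol` would be needed, and the right threshold depends on the accelerator's numeric format.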

