LoRAP: Low-Rank Aggregation Prompting for Quantized Graph Neural Networks Training
This work addresses efficient GNN deployment in resource-constrained environments by improving quantization performance. The contribution appears incremental: it builds on existing QAT frameworks and adds a novel prompting technique.
The paper tackles performance degradation in quantized Graph Neural Networks (GNNs) under quantization-aware training (QAT). It proposes LoRAP (Low-Rank Aggregation Prompting), which injects lightweight prompts into aggregated features to optimize quantized aggregations. Results show consistent performance gains for low-bit quantized GNNs across 4 QAT frameworks and 9 graph datasets, with minimal computational overhead.
Graph Neural Networks (GNNs) are neural networks designed to process graph data, capturing the relationships and interactions between nodes through the message-passing mechanism. GNN quantization has emerged as a promising approach for reducing model size and accelerating inference in resource-constrained environments. Compared with quantization in LLMs, GNN quantization places more emphasis on quantizing graph features. Motivated by this, we propose to leverage prompt learning, which manipulates the input data, to improve the performance of quantization-aware training (QAT) for GNNs. Because prompting the node features alone can make only part of the quantized aggregation result optimal, we introduce Low-Rank Aggregation Prompting (LoRAP), which injects lightweight, input-dependent prompts into each aggregated feature to optimize the results of quantized aggregations. Extensive evaluations on 4 leading QAT frameworks over 9 graph datasets demonstrate that LoRAP consistently enhances the performance of low-bit quantized GNNs while introducing minimal computational overhead.
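The idea of adding a low-rank, input-dependent prompt to each aggregated feature can be illustrated with a minimal NumPy sketch. The text above does not give the exact formulation, so everything here is an assumption made for illustration: the prompt is modelled as a rank-r correction `(agg @ down) @ up`, aggregation is a simple neighbour mean, and quantization is uniform symmetric fake quantization. All function and parameter names (`fake_quantize`, `mean_aggregate`, `low_rank_prompt`, `down`, `up`) are hypothetical and are not taken from the paper or from any QAT framework.

```python
import numpy as np


def fake_quantize(x, num_bits=4):
    """Uniform symmetric fake quantization: snap x to a num_bits integer
    grid (per tensor), then map back to floating point."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale


def mean_aggregate(adj, feats):
    """Mean over neighbours. adj is a dense (n, n) 0/1 adjacency matrix;
    nodes without neighbours receive a zero vector."""
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    return (adj @ feats) / deg


def low_rank_prompt(agg, down, up):
    """Assumed prompt form: project the aggregated features to rank r with
    down (d, r), then back to dimension d with up (r, d). The prompt
    depends on its input and costs 2*d*r parameters instead of d*d."""
    return (agg @ down) @ up


def prompted_quantized_aggregation(adj, feats, down, up, num_bits=4):
    """Quantize node features, aggregate, add the low-rank prompt to each
    aggregated feature, then quantize the prompted aggregation."""
    q_feats = fake_quantize(feats, num_bits)
    agg = mean_aggregate(adj, q_feats)
    agg = agg + low_rank_prompt(agg, down, up)
    return fake_quantize(agg, num_bits)


# Toy usage: 5 nodes, 8 feature dimensions, rank-2 prompt.
rng = np.random.default_rng(0)
n, d, r = 5, 8, 2
adj = (rng.random((n, n)) < 0.5).astype(float)
feats = rng.normal(size=(n, d))
down = rng.normal(size=(d, r)) * 0.1  # would be learned during QAT
up = np.zeros((r, d))                 # zero init: the prompt starts as a no-op
out = prompted_quantized_aggregation(adj, feats, down, up)
```

Initializing `up` to zeros is a design choice borrowed from common low-rank adapter practice, not something the text states: the prompted model then starts out identical to the unprompted quantized baseline, and training only has to learn a correction. In an actual QAT setup, `down` and `up` would be trained with gradients passed through the rounding step, for example with a straight-through estimator, which this forward-only sketch omits.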