DC · NA · NA · Apr 15

PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV

arXiv:2604.13433 · 2.1 · h-index: 2
Predicted impact: top 98% in DC · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the memory bandwidth bottleneck in sparse linear algebra for GPU computing, offering a flexible format that can trade precision for performance.

PackSELL is a new sparse matrix format that reduces memory footprint and data movement for SpMV on GPUs, achieving up to 1.63x speedup over cuSPARSE SELL in FP16 and up to 2.09x speedup in mixed-precision PCG solvers.

We propose a new sparse matrix format, PackSELL, designed to support diverse data representations and enable efficient sparse matrix-vector multiplication (SpMV) on GPUs. Building on sliced ELLPACK (SELL), PackSELL incorporates delta encoding of column indices and a novel packing scheme that stores each index-delta-value pair in a single word, thereby reducing memory footprint and data movement. This design further enables fine-grained control over the bit allocation between deltas and values, allowing flexible data representations, including non-IEEE formats. Experimental results show that, when configured for half precision (FP16), the PackSELL-based SpMV kernel outperforms the cuSPARSE SELL-based kernel by up to $1.63\times$. Moreover, with configurations using customized formats, PackSELL achieves FP32-level accuracy while exceeding the performance of FP16 cuSPARSE. These benefits extend to sparse linear solvers; for example, a mixed-precision preconditioned conjugate gradient (PCG) solver using PackSELL achieves up to a $2.09\times$ speedup over the standard full-precision PCG.
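The packing idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual bit layout, so everything below is an assumption made for illustration: a 32-bit word, the column-index delta in the upper `delta_bits` bits, and an FP32 value truncated (not rounded) to the remaining bits. The names `pack_entry`, `unpack_entry`, `pack_row`, and `spmv_row` are hypothetical and do not come from the paper, and the sketch covers one row on the CPU rather than a sliced GPU kernel.

```python
import struct

WORD_BITS = 32  # assumed word size; the paper's real layout may differ


def float_to_bits(x: float) -> int:
    """Reinterpret a value as its IEEE 754 binary32 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def pack_entry(delta: int, value: float, delta_bits: int) -> int:
    """Pack a column delta and a truncated FP32 value into one word."""
    value_bits = WORD_BITS - delta_bits
    if not 0 <= delta < (1 << delta_bits):
        # Assumption: this sketch simply rejects deltas that do not fit.
        raise ValueError("delta does not fit in delta_bits")
    # Keep the top value_bits bits (sign, exponent, leading mantissa bits).
    truncated = float_to_bits(value) >> delta_bits
    return (delta << value_bits) | truncated


def unpack_entry(word: int, delta_bits: int) -> tuple[int, float]:
    """Recover the delta and the (possibly truncated) value from a word."""
    value_bits = WORD_BITS - delta_bits
    delta = word >> value_bits
    truncated = word & ((1 << value_bits) - 1)
    return delta, bits_to_float(truncated << delta_bits)


def pack_row(cols: list[int], vals: list[float], delta_bits: int) -> list[int]:
    """Delta-encode sorted column indices and pack each entry into a word."""
    words, prev = [], 0
    for c, v in zip(cols, vals):
        words.append(pack_entry(c - prev, v, delta_bits))
        prev = c
    return words


def spmv_row(words: list[int], x: list[float], delta_bits: int) -> float:
    """Dot product of one packed row with a dense vector x."""
    col, acc = 0, 0.0
    for w in words:
        delta, v = unpack_entry(w, delta_bits)
        col += delta  # running sum of deltas restores the column index
        acc += v * x[col]
    return acc
```

Changing `delta_bits` shifts the trade-off the abstract describes: more bits for deltas allows larger gaps between nonzeros in a row, while fewer bits leave more mantissa for the value. With `delta_bits=8`, for instance, each value keeps its sign, 8 exponent bits, and 15 mantissa bits, which is one example of a non-IEEE representation.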

