MS DC NA NA | Sep 14, 2015

Speculative Segmented Sum for Sparse Matrix-Vector Multiplication on Heterogeneous Processors

arXiv:1504.06474 (has code)
Originality: Incremental advance
AI Analysis

The paper addresses the need for efficient SpMV on heterogeneous processors. SpMV is a key operation in scientific and graph applications.

The paper proposes a speculative segmented sum approach for SpMV on CPU-GPU heterogeneous processors. It reports significant performance improvement over the best existing CSR-based algorithms across 20 sparse matrices on three platforms.

Sparse matrix-vector multiplication (SpMV) is a central building block for scientific software and graph applications. Recently, heterogeneous processors composed of different types of cores have attracted much attention because of their flexible core configuration and high energy efficiency. In this paper, we propose a compressed sparse row (CSR) format based SpMV algorithm that uses both types of cores in a CPU-GPU heterogeneous processor. We first speculatively execute segmented sum operations on the GPU part of a heterogeneous processor and generate possibly incorrect results. Then the CPU part of the same chip is triggered to rearrange the predicted partial sums into a correct resulting vector. On three heterogeneous processors from Intel, AMD and nVidia, using 20 sparse matrices as a benchmark suite, the experimental results show that our method obtains significant performance improvement over the best existing CSR-based SpMV algorithms. The source code of this work is available at https://github.com/bhSPARSE/Benchmark_SpMV_using_CSR
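
The two-stage idea in the abstract can be illustrated with a small sequential sketch. This is not the authors' implementation: it runs in plain Python with no GPU, and it assumes that the speculation concerns empty rows, so that stage 1 guesses the j-th partial sum belongs to row j and stage 2 moves each sum to its true row. The function names `spmv_csr_reference` and `speculative_segmented_sum` and the argument names `row_ptr`, `col_idx`, `vals` are hypothetical, chosen only for this example.

```python
def spmv_csr_reference(row_ptr, col_idx, vals, x):
    """Plain row-by-row CSR SpMV, used here only as a correctness baseline."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


def speculative_segmented_sum(row_ptr, col_idx, vals, x):
    """Sketch of a speculate-then-rearrange SpMV (assumed simplification)."""
    n_rows = len(row_ptr) - 1
    nnz = len(vals)

    # Stage 1 (would run on the GPU part): flag the first nonzero of each
    # non-empty row, then sum products segment by segment. An empty row has
    # no nonzero of its own, so it contributes no flag and no segment.
    head = [False] * nnz
    for i in range(n_rows):
        if row_ptr[i] < row_ptr[i + 1]:
            head[row_ptr[i]] = True

    partial = []  # speculative result: partial[j] is guessed to be y[j]
    for k in range(nnz):
        product = vals[k] * x[col_idx[k]]
        if head[k]:
            partial.append(product)
        else:
            partial[-1] += product

    # Stage 2 (would run on the CPU part): the guess is wrong whenever empty
    # rows exist, so place each partial sum at its true row index.
    y = [0.0] * n_rows
    j = 0
    for i in range(n_rows):
        if row_ptr[i] < row_ptr[i + 1]:
            y[i] = partial[j]
            j += 1
    return partial, y
```

For a matrix with an empty row, `partial` is shorter than the result vector and misaligned after that row, while the rearranged `y` matches the reference routine. In this sketch both stages are sequential loops; the design point it tries to show is only the split between a regular, data-parallel summation and a cheap positional fix-up.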
