LG · AI · DC · Nov 22, 2023

Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores

arXiv:2311.13693v1 · 1 citation · h-index: 13
Originality: Highly original
AI Analysis

This work addresses a critical bottleneck in data science applications like gene analysis and deep learning by enabling tensor decomposition for trillion-scale data, representing a significant advance over existing billion-scale methods.

The paper tackles the challenge of scaling CP tensor decomposition to exascale tensors, proposing a compression-based framework that supports 8,000x larger tensors and achieves up to 6.95x speedup compared to baselines.

CP decomposition is a powerful tool for data science, especially in gene analysis, deep learning, and quantum computation. However, its application is largely hindered by the exponential growth of computational complexity and storage consumption with tensor size. While real-world data is usually presented as trillion-scale or even exascale tensors, existing work supports only billion-scale tensors. In this work, we propose Exascale-Tensor, a compression-based tensor decomposition framework, to close this gap and support exascale tensor decomposition. We then carefully analyze the inherent parallelism and propose a set of strategies to improve computational efficiency. Finally, we conduct experiments decomposing tensors ranging from million-scale to trillion-scale. Compared to the baselines, Exascale-Tensor supports 8,000x larger tensors and achieves a speedup of up to 6.95x. We also apply our method to two real-world applications, gene analysis and tensor-layer neural networks, where the numerical results demonstrate its scalability and effectiveness.
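The abstract does not spell out the compression step, so the following is only a minimal NumPy sketch of the general idea behind compression-based CP decomposition, not the paper's algorithm. It illustrates one standard identity: multiplying a rank-R CP tensor along each mode by a short, wide matrix yields a much smaller tensor that is still rank-R, with factors equal to the compressed original factors. The function names (`cp_to_tensor`, `compress`), the random Gaussian compression matrices, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def cp_to_tensor(A, B, C):
    """Build a 3-way tensor from CP factors: X[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def compress(X, U, V, W):
    """Multiply X along modes 1, 2, 3 by the compression matrices U, V, W."""
    return np.einsum("ai,bj,ck,ijk->abc", U, V, W, X, optimize=True)

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): a 30 x 40 x 50 tensor of CP rank 3,
# compressed down to 6 x 7 x 8.
I, J, K, R = 30, 40, 50, 3
L, M, N = 6, 7, 8

# Ground-truth CP factors and the full tensor they generate.
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))
X = cp_to_tensor(A, B, C)

# Random Gaussian compression matrices, one per mode (an assumed choice).
U = rng.standard_normal((L, I))
V = rng.standard_normal((M, J))
W = rng.standard_normal((N, K))

# Compressing the tensor gives the same result as compressing the factors.
Y = compress(X, U, V, W)
Y_from_factors = cp_to_tensor(U @ A, V @ B, W @ C)

compression_ratio = X.size / Y.size
print("compressed shape:", Y.shape)
print("compression ratio:", round(compression_ratio, 1))
print("identity holds:", np.allclose(Y, Y_from_factors))
```

Because the small tensor `Y` keeps the CP structure, a decomposition can in principle run on `Y` instead of `X`. Recovering the original factors from the compressed ones requires additional steps (for example, combining several independent compressions), which this sketch does not attempt.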
