Towards Programmable Memory Controller for Tensor Decomposition
This work addresses a domain-specific problem for data science applications by providing an incremental improvement in FPGA-based acceleration for tensor decomposition.
The paper tackles the challenge of accelerating sparse MTTKRP, a key kernel in tensor decomposition, by designing a custom memory controller on FPGA to address irregular memory access patterns, resulting in improved performance and energy efficiency.
Tensor decomposition has become an essential tool in many data science applications. Sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is the pivotal kernel in tensor decomposition algorithms, which decompose large, higher-order real-world tensors into multiple matrices. Accelerating MTTKRP can therefore substantially speed up the overall tensor decomposition process. Sparse MTTKRP is challenging to accelerate because of its irregular memory access characteristics. Implementing accelerators for kernels such as MTTKRP on Field Programmable Gate Arrays (FPGAs) is attractive because of the energy efficiency and inherent parallelism of FPGAs. This paper explores the opportunities and key challenges in designing a custom memory controller on FPGA for MTTKRP, presents an approach to such a design, and explores the parameter space of the resulting memory controller.
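To make the irregular access pattern concrete, the following is a minimal software sketch of mode-0 sparse MTTKRP for a 3-way tensor. It is not the paper's design: the coordinate (COO) storage format, the function name `sparse_mttkrp_mode0`, and its parameters are assumptions chosen for illustration. For each nonzero x_ijk, the kernel reads row j of factor matrix B and row k of factor matrix C, then accumulates into row i of the output, so which rows are touched depends entirely on the coordinates of the nonzeros.

```python
import numpy as np

def sparse_mttkrp_mode0(indices, values, B, C, num_rows):
    """Mode-0 MTTKRP for a 3-way sparse tensor in COO format (illustrative).

    indices:  (nnz, 3) integer array of (i, j, k) coordinates
    values:   (nnz,) array of nonzero values
    B:        (J, R) factor matrix for mode 1
    C:        (K, R) factor matrix for mode 2
    num_rows: size I of mode 0
    Returns M of shape (I, R), where
    M[i, :] = sum over nonzeros (i, j, k) of x_ijk * (B[j, :] * C[k, :]).
    """
    rank = B.shape[1]
    M = np.zeros((num_rows, rank), dtype=float)
    for (i, j, k), x in zip(indices, values):
        # B[j] and C[k] are data-dependent row gathers; the update to M[i]
        # is a data-dependent scatter. These are the irregular accesses.
        M[i, :] += x * B[j, :] * C[k, :]
    return M
```

In this sketch, consecutive nonzeros can reference rows of B, C, and M that are far apart in memory, which is the kind of access behavior a custom memory controller would need to handle.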