A mixed precision semi-Lagrangian algorithm and its performance on accelerators
For computational scientists using semi-Lagrangian methods, this work offers a practical mixed precision approach that improves performance on accelerators.
The paper proposes a mixed precision semi-Lagrangian discontinuous Galerkin algorithm and evaluates its performance on a dual-socket CPU workstation, a Xeon Phi, and an NVIDIA K80, reporting a considerable reduction in memory use together with a substantial increase in performance.
In this paper we propose a mixed precision algorithm in the context of the semi-Lagrangian discontinuous Galerkin method. We evaluate the performance of this approach on a traditional dual-socket workstation as well as on a Xeon Phi and an NVIDIA K80, and find that the mixed precision algorithm can be implemented efficiently on all three architectures. Consequently, in addition to the considerable reduction in memory use, we observe a substantial increase in performance. We also discuss the relative performance of our implementations across these platforms.
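The abstract does not say how precision is split inside the algorithm, so the following is only a minimal sketch of one common mixed precision pattern, shown here as an assumption: data is stored in single precision while arithmetic is carried out in double precision. For brevity it uses linear interpolation on a periodic grid for constant-speed advection as a stand-in for the discontinuous Galerkin discretization. The names `semi_lagrangian_step` and `storage_bytes` are hypothetical and chosen for illustration.

```python
from array import array
import math


def semi_lagrangian_step(u, a, dt, dx):
    """Advance u_t + a*u_x = 0 by one time step on a periodic grid.

    Assumption (not taken from the paper): u is stored in single
    precision (array typecode 'f'), while the interpolation arithmetic
    runs in Python floats, which are double precision.
    """
    n = len(u)
    shift = a * dt / dx                # displacement in units of grid cells
    out = array('f', [0.0] * n)        # single precision storage for the result
    for i in range(n):
        x = i - shift                  # foot of the characteristic (grid units)
        j = math.floor(x)              # index of the grid point left of the foot
        w = x - j                      # linear interpolation weight in [0, 1)
        # Values read from u are widened to double before the arithmetic.
        val = (1.0 - w) * u[j % n] + w * u[(j + 1) % n]
        out[i] = val                   # rounded to single precision on store
    return out


def storage_bytes(typecode, n):
    """Bytes needed to store n values of the given array typecode."""
    return array(typecode).itemsize * n
```

On typical platforms `storage_bytes('f', n)` is half of `storage_bytes('d', n)`, which illustrates the kind of memory reduction that single precision storage provides. Whether the paper uses this particular split between storage and arithmetic is not stated in the abstract.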