Hartwig Anzt

h-index25

5papers

2,185citations

Novelty30%

AI Score38

Ranked #85,710 of 194,257 authors (top 44%)#407 in DC (top 42%)

5 Papers

8.7NAApr 29Code

Permutation-Avoiding FFT-Based Convolution

Nicolas Venkovic, Hartwig Anzt

Fast Fourier Transform (FFT) libraries are widely used for evaluating discrete convolutions. Most FFT implementations follow some variant of the Cooley-Tukey framework, in which the transform is decomposed into butterfly operations and index-reversal permutations. While butterfly operations dominate the floating-point operation count, the memory access patterns induced by index-reversal permutations significantly degrade the FFT's arithmetic intensity. When performing discrete convolution, the three sets of index-reversal permutations which occur in FFT-based implementations using Cooley-Tukey frameworks cancel out, thus paving the way to implementations free of any permutation. To the best of our knowledge, such permutation-free variants of FFT-based discrete convolution are not commonly used in practice, making such kernels worth investigating. Here, we look into such permutation-avoiding convolution procedures for multi-dimensional cases within a general radix Cooley-Tukey framework. We perform numerical experiments to benchmark the algorithms presented against state-of-the-art FFT-based convolution implementations. Our results suggest that developers of FFT libraries should consider supporting permutation-avoiding convolution kernels.

2.3AO-PHSep 16, 2023

Earth Virtualization Engines -- A Technical Perspective

Torsten Hoefler, Bjorn Stevens, Andreas F. Prein et al.

Participants of the Berlin Summit on Earth Virtualization Engines (EVEs) discussed ideas and concepts to improve our ability to cope with climate change. EVEs aim to provide interactive and accessible climate simulations and data for a wide range of users. They combine high-resolution physics-based models with machine learning techniques to improve the fidelity, efficiency, and interpretability of climate projections. At their core, EVEs offer a federated data layer that enables simple and fast access to exabyte-sized climate data through simple interfaces. In this article, we summarize the technical challenges and opportunities for developing EVEs, and argue that they are essential for addressing the consequences of climate change.

4.5PFJun 18

Randomized Sketching is Robust to Low-Precision Rounding on GPUs

Aryaman Jeendgar, Clément Flint, Hartwig Anzt

Randomized sketching is a core primitive in randomized numerical linear algebra. On modern hardware architectures, in particular on GPUs, the performance of sparse sketches is limited by memory traffic and atomic accumulation rather than floating-point throughput. This makes sketching a natural target for mixed precision, provided that low-precision accumulation does not degrade the embedding quality. We study mixed-precision GPU implementations of sparse oblivious subspace embeddings, focusing on a SparseStack generalization of the GPU CountSketch kernel of Higgins et al. SparseStack improves embedding quality relative to CountSketch on coherent inputs, but its additional nonzeros per column increase atomic-update contention and reduce throughput. We therefore implement FP16 SparseStack variants using deterministic round-to-nearest, exact stochastic rounding, and dithered rounding, and compare them with FP32 SparseStack, CountSketch, mixed-precision CountSketch, and FlashSketch. Our main empirical finding is that, for the tested regimes, SparseStack embedding quality is insensitive to the FP16 rounding rule. Deterministic, stochastic, and dithered rounding FP16 SparseStack produce nearly identical subspace distortion and sketch-and-solve least-squares accuracy across incoherent, coherent, and adversarial test problems. The dominant accuracy factor is the sketch distribution rather than the quantization rule: SparseStack variants substantially improve distortion on coherent inputs, while all methods behave similarly on incoherent inputs. Since deterministic rounding has the lowest overhead, it provides the best performance--accuracy tradeoff among the FP16 SparseStack variants.

1.2DCNov 17, 2020

Ginkgo -- A Math Library designed for Platform Portability

Terry Cojean, Yu-Hsiang "Mike" Tsai, Hartwig Anzt

The first associations to software sustainability might be the existence of a continuous integration (CI) framework; the existence of a testing framework composed of unit tests, integration tests, and end-to-end tests; and also the existence of software documentation. However, when asking what is a common deathblow for a scientific software product, it is often the lack of platform and performance portability. Against this background, we designed the Ginkgo library with the primary focus on platform portability and the ability to not only port to new hardware architectures, but also achieve good performance. In this paper we present the Ginkgo library design, radically separating algorithms from hardware-specific kernels forming the distinct hardware executors, and report our experience when adding execution backends for NVIDIA, AMD, and Intel GPUs. We also comment on the different levels of performance portability, and the performance we achieved on the distinct hardware backends.

5.1GLApr 27, 2020

An Environment for Sustainable Research Software in Germany and Beyond: Current State, Open Challenges, and Call for Action

Hartwig Anzt, Felix Bach, Stephan Druskat et al.

Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software must be available, discoverable, usable, and adaptable to new needs, both now and in the future. Research software therefore requires an environment that supports sustainability. Hence, a change is needed in the way research software development and maintenance are currently motivated, incentivized, funded, structurally and infrastructurally supported, and legally treated. Failing to do so will threaten the quality and validity of research. In this paper, we identify challenges for research software sustainability in Germany and beyond, in terms of motivation, selection, research software engineering personnel, funding, infrastructure, and legal aspects. Besides researchers, we specifically address political and academic decision-makers to increase awareness of the importance and needs of sustainable research software practices. In particular, we recommend strategies and measures to create an environment for sustainable research software, with the ultimate goal to ensure that software-driven research is valid, reproducible and sustainable, and that software is recognized as a first class citizen in research. This paper is the outcome of two workshops run in Germany in 2019, at deRSE19 - the first International Conference of Research Software Engineers in Germany - and a dedicated DFG-supported follow-up workshop in Berlin.