Felix Wolf

h-index: 30

Papers: 4

Citations: 81

Novelty: 55%

AI Score: 43

Ranked #57,121 of 194,257 authors (top 29%); #74 in CE (top 22%)

4 Papers

1.2 CE Sep 18, 2017

Recent Advances of Isogeometric Analysis in Computational Electromagnetics

Zeger Bontinck, Jacopo Corno, Herbert De Gersem et al.

In this communication the advantages and drawbacks of isogeometric analysis (IGA) are reviewed in the context of electromagnetic simulations. IGA extends the set of polynomial basis functions commonly employed by the classical Finite Element Method (FEM). While identical to FEM with Nédélec's basis functions in the lowest-order case, it is based on B-spline and Non-Uniform Rational B-spline basis functions. The main benefit of this is the exact representation of the geometry in the language of computer-aided design (CAD) tools. This simplifies the meshing, as the computational mesh is implicitly created by the engineer using the CAD tool. The curl- and div-conforming spline function spaces are recapitulated and the available software is discussed. Finally, several non-academic benchmark examples in two and three dimensions are shown, which are used in optimization and uncertainty quantification workflows.
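
As a rough illustration of the B-spline basis functions mentioned in the abstract, the sketch below evaluates a scalar basis function with the standard Cox-de Boor recursion. It is not taken from the paper: the function name `bspline_basis`, the example knot vector, and the degree are assumptions chosen for illustration, and the curl- and div-conforming spline spaces the paper discusses are not covered here.

```python
def bspline_basis(i, p, knots, x):
    """Evaluate the i-th B-spline basis function of degree p at x.

    Uses the Cox-de Boor recursion; terms with a zero-length knot span
    are treated as zero (the usual 0/0 := 0 convention).
    """
    if p == 0:
        # Degree 0: indicator function of the half-open knot span.
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0

    left = 0.0
    span_left = knots[i + p] - knots[i]
    if span_left > 0:
        left = (x - knots[i]) / span_left * bspline_basis(i, p - 1, knots, x)

    right = 0.0
    span_right = knots[i + p + 1] - knots[i + 1]
    if span_right > 0:
        right = (knots[i + p + 1] - x) / span_right * bspline_basis(i + 1, p - 1, knots, x)

    return left + right


# Hypothetical example: open knot vector for degree 2 on [0, 1].
degree = 2
knots = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]
num_basis = len(knots) - degree - 1  # 4 basis functions

values = [bspline_basis(i, degree, knots, 0.3) for i in range(num_basis)]
total = sum(values)
```

On an open knot vector the basis functions are non-negative and sum to one at every point inside the domain, which is a quick sanity check for an implementation like this one.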

7.4 DC Jun 2

I Like To Move It -- Computation Instead of Data in the Brain

Fabian Czappa, Marvin Kaster, Felix Wolf

The detailed functioning of the human brain remains incompletely understood. Large-scale brain simulations complement experimental research but face substantial computational challenges: the human brain comprises approximately $10^{11}$ neurons connected by $10^{14}$ synapses, collectively forming the connectome. Empirical evidence indicates that modifications of the connectome -- specifically the formation and elimination of synapses, referred to as structural plasticity -- are essential for processes such as learning and memory formation. Connectivity updates can be computed efficiently using a Barnes--Hut-inspired approximation that reduces computational complexity from $O(n^2)$ to $O(n \log n)$, where $n$ denotes the number of neurons. Despite this improvement, communication overhead still limits scalability. Synapse updates rely heavily on remote memory access (RMA), and spike transmission requires all-to-all communication at every simulation time step. We introduce a novel algorithm that reduces communication by migrating computation rather than data. This approach reduces connectivity update time by a factor of 6 and spike transmission time by more than 2 orders of magnitude.
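
To give a feel for the complexity reduction quoted in the abstract, the following back-of-the-envelope sketch compares $n^2$ with $n \log n$ at the abstract's neuron count of $10^{11}$. The base-2 logarithm and the omission of constant factors are assumptions made here; the numbers are idealized operation counts, not measurements from the paper.

```python
import math


def pairwise_vs_tree_cost(n):
    """Return (n^2, n*log2(n), ratio) as idealized operation counts.

    Constant factors are ignored and base 2 is an assumption; asymptotic
    notation does not fix either of them.
    """
    quadratic = n ** 2
    log_linear = n * math.log2(n)
    return quadratic, log_linear, quadratic / log_linear


neurons = 10 ** 11  # approximate neuron count quoted in the abstract
quadratic, log_linear, ratio = pairwise_vs_tree_cost(neurons)
print(f"n^2        = {quadratic:.3e}")
print(f"n log2 n   = {log_linear:.3e}")
print(f"reduction  = {ratio:.3e}x")
```

Under these assumptions the idealized operation count shrinks by roughly nine orders of magnitude. As the abstract notes, communication overhead can still limit scalability at that point.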

4.3 PL Feb 24, 2021

Learning to Make Compiler Optimizations More Effective

Rahim Mammadli, Marija Selakovic, Felix Wolf et al.

Because loops execute their body many times, compiler developers place much emphasis on their optimization. Nevertheless, in view of highly diverse source code and hardware, compilers still struggle to produce optimal target code. The sheer number of possible loop optimizations, including their combinations, exacerbates the problem further. Today's compilers use hard-coded heuristics to decide when, whether, and which of a limited set of optimizations to apply. Often, this leads to highly unstable behavior, making the success of compiler optimizations dependent on the precise way a loop has been written. This paper presents LoopLearner, which addresses the problem of compiler instability by predicting which way of writing a loop will lead to efficient compiled code. To this end, we train a neural network to find semantically invariant source-level transformations for loops that help the compiler generate more efficient code. Our model learns to extract useful features from the raw source code and predicts the speedup that a given transformation is likely to yield. We evaluate LoopLearner with 1,895 loops from various performance-relevant benchmarks. Applying the transformations that our model deems most favorable prior to compilation yields an average speedup of 1.14x. When trying the top-3 suggested transformations, the average speedup even increases to 1.29x. Comparing the approach with an exhaustive search through all available code transformations shows that LoopLearner helps to identify the most beneficial transformations in several orders of magnitude less time.

8.5 LG Aug 20, 2020

Static Neural Compiler Optimization via Deep Reinforcement Learning

Rahim Mammadli, Ali Jannesari, Felix Wolf

The phase-ordering problem of modern compilers has received a lot of attention from the research community over the years, yet remains largely unsolved. Various optimization sequences exposed to the user are manually designed by compiler developers. In designing such a sequence, developers have to choose the set of optimization passes, their parameters, and their ordering within the sequence. The resulting sequences usually fall short of achieving optimal runtime for a given source code and may sometimes even degrade performance compared to the unoptimized version. In this paper, we employ a deep reinforcement learning approach to the phase-ordering problem. Provided with sub-sequences constituting LLVM's O3 sequence, our agent learns to outperform the O3 sequence on the set of source codes used for training and achieves competitive performance on the validation set, gaining up to 1.32x speedup on previously unseen programs. Notably, our approach differs from autotuning methods by not depending on one or more test runs of the program for making successful optimization decisions. It has no dependence on any dynamic feature, but only on the statically attainable intermediate representation of the source code. We believe that the models trained using our approach can be integrated into modern compilers as neural optimization agents, at first to complement, and eventually replace, the hand-crafted optimization sequences.