Dominik Walter

2 Papers

4.0 AR Mar 30
Loop Control Management in Tightly Coupled Processor Arrays (TCPAs)

Dominik Walter, Frank Hannig, Jürgen Teich

Multidimensional loop kernels often suffer from control overhead that can dominate execution time on parallel loop accelerators. Tightly Coupled Processor Arrays (TCPAs) offload loop control to a global controller (GC), but existing approaches still require hundreds of control signals. We propose a method to derive these control conditions from a polyhedral representation of the iteration space and to reduce them aggressively, cutting the number of control signals by 15x to 45x across several benchmarks. We introduce a lightweight GC architecture that evaluates conditions as unions of polyhedra using bounded evaluation units and requires hardware comparable to a single processing element. Control signals are distributed throughout the array with a minimal number of delay elements, resulting in zero-overhead loop control. Our evaluation on PolyBench kernels shows that the entire control flow requires less than 10% of the total array resources.
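To illustrate what "evaluating a condition as a union of polyhedra" can mean, the following is a minimal software sketch, not the GC architecture from the paper. It assumes each polyhedron is given as a list of half-spaces `a . x <= b`; the names `in_polyhedron`, `control_signal`, and the 4x4 example iteration space are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# A half-space (a, b) encodes the constraint a . x <= b.
HalfSpace = Tuple[Sequence[int], int]
# A polyhedron is the intersection of its half-spaces.
Polyhedron = List[HalfSpace]


def in_polyhedron(point: Sequence[int], poly: Polyhedron) -> bool:
    """Return True if the point satisfies every half-space constraint."""
    return all(sum(c * x for c, x in zip(a, point)) <= b for a, b in poly)


def control_signal(point: Sequence[int], union: List[Polyhedron]) -> bool:
    """A condition holds if the point lies in at least one polyhedron."""
    return any(in_polyhedron(point, poly) for poly in union)


# Hypothetical condition on a 4x4 iteration space (0 <= i, j <= 3):
# the signal is active on the first row (i == 0) or first column (j == 0).
first_row = [((1, 0), 0), ((-1, 0), 0)]   # i <= 0 and -i <= 0, i.e. i == 0
first_col = [((0, 1), 0), ((0, -1), 0)]   # j <= 0 and -j <= 0, i.e. j == 0
border = [first_row, first_col]

# Count the iterations for which the signal is active.
active = sum(control_signal((i, j), border) for i in range(4) for j in range(4))
```

In this sketch the condition costs four comparisons per iteration; the reduction described in the abstract concerns how many such conditions must be generated and distributed at all.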

24.1 AR Apr 8
Symbolic Polyhedral-Based Energy Analysis for Nested Loop Programs

Avinash Mahesh Nirmala, Dominik Walter, Frank Hannig et al.

This work presents a symbolic approach for estimating the energy consumption of nested loop programs when mapped and scheduled on parallel processor array accelerator architectures. Instead of simulation-based evaluation, we derive a methodology for symbolic energy analysis that captures the impact of mapping and scheduling decisions of loop nests on processor arrays. We compare our approach against simulation-based results for selected benchmarks and varying sizes of the iteration spaces. Whereas simulation does not scale with the size of the iteration space, the cost of our symbolic analysis is shown to be independent of the problem size. The presented evaluation methodology can be beneficially used during the design space exploration of mapping and scheduling decisions, for studying the influence of array size variations, and for comparisons with other loop nest accelerator architectures.
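The contrast between simulation and symbolic analysis can be sketched on a toy example. This is not the methodology of the paper: it assumes a triangular loop nest (`0 <= j <= i < n`), a single operation per iteration, and a hypothetical constant energy per operation `e_op_pj` in picojoules. The function names are likewise illustrative.

```python
def simulated_energy(n: int, e_op_pj: int) -> int:
    """Enumerate every iteration; runtime grows with the iteration space."""
    total = 0
    for i in range(n):
        for j in range(i + 1):
            total += e_op_pj  # one operation per iteration (assumption)
    return total


def symbolic_energy(n: int, e_op_pj: int) -> int:
    """Closed form: the triangular space holds n * (n + 1) / 2 points,
    so the evaluation cost does not depend on n."""
    return e_op_pj * n * (n + 1) // 2


# Both approaches agree, but only the first one loops over the space.
energy_small = symbolic_energy(4, 3)
energy_large = symbolic_energy(10**9, 3)
```

Here `simulated_energy` takes time quadratic in `n`, while `symbolic_energy` evaluates a fixed expression, which is the scalability argument the abstract makes.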