AR · AI · LG · Apr 21

Design Rules for Extreme-Edge Scientific Computing on AI Engines

arXiv:2604.19106 · 12.81 citations · h-index: 32
Predicted impact: top 43% in AR · last 90 days · Originality: Incremental advance
AI Analysis

For developers of extreme-edge scientific applications, this provides design rules to choose between AI Engines and programmable logic, addressing a scalability bottleneck.

This work characterizes AI Engines vs. programmable logic for extreme-edge scientific neural networks, introducing a LARE metric to identify when AI Engines outperform PL, and demonstrates deployment of networks that cannot fit on PL using hls4ml.

Extreme-edge scientific applications use machine learning models to analyze sensor data and make real-time decisions. Their stringent latency and throughput requirements demand small batch sizes and require that model weights remain fully on-chip. Spatial dataflow implementations are common for extreme-edge applications. Spatial dataflow works well for small networks, but it fails to scale to larger models due to inherent resource scaling limitations. AI Engines on modern FPGA SoCs offer a promising alternative with high compute density and additional on-chip memory. However, the architecture, programming model, and performance-scaling behavior of AI Engines differ fundamentally from those of the programmable logic, making direct comparison non-trivial and the benefits of using AI Engines unclear. This work addresses how and when extreme-edge scientific neural networks should be implemented on AI Engines versus programmable logic. We provide systematic architectural characterization and micro-benchmarking and introduce a latency-adjusted resource equivalence (LARE) metric that identifies when AI Engine implementations outperform programmable logic designs. We further propose spatial and API-level dataflow optimizations tailored to low-latency scientific inference. Finally, we demonstrate the successful deployment of end-to-end neural networks on AI Engines that cannot fit on programmable logic when using the hls4ml toolchain.
