AR · Apr 27

Compilation and Execution of an Embeddable YOLO-NAS on the VTA

arXiv:2604.24455 · 8.8
AI Analysis

For developers deploying CNNs on FPGA accelerators in safety-critical domains, this work provides a fully automated compiler for the VTA that handles CNNs whose parameters do not fit in on-chip memory, a limitation of the authors' earlier compiler.

The paper extends and automates the VTA compilation chain to support larger CNNs, enabling successful compilation and simulated execution of a YOLO-NAS object detection model on an FPGA-based accelerator for safety-critical domains like aeronautics.

Deploying complex Convolutional Neural Networks (CNNs) on FPGA-based accelerators is a promising way forward for safety-critical domains such as aeronautics. In previous work, we explored the Versatile Tensor Accelerator (VTA) and showed its suitability for avionic applications. To that end, we developed an initial stand-alone compiler designed with certification in mind. However, that compiler still had limitations, which this paper overcomes. Our contributions are to extend and fully automate the VTA compilation chain so that it compiles complete CNNs and supports larger CNNs whose parameters do not fit in on-chip memory. We demonstrate its effectiveness by successfully compiling a YOLO-NAS object detection model and executing it in simulation.
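To make the "parameters do not fit in on-chip memory" constraint concrete, here is a minimal sketch of one common way to handle it: splitting a convolution layer's filters into groups that each fit in a weight buffer. This is an illustration only, not the paper's compilation scheme. The function name `plan_weight_tiles`, the 256 KiB buffer size, the layer shape, and the one-byte weight width are all hypothetical assumptions.

```python
from typing import List, Tuple


def plan_weight_tiles(out_channels: int, in_channels: int,
                      kernel_h: int, kernel_w: int,
                      buffer_bytes: int,
                      bytes_per_weight: int = 1) -> List[Tuple[int, int]]:
    """Split a conv layer's filters into [start, end) ranges that fit on chip.

    Hypothetical helper: assumes the weight buffer holds whole filters and
    that each weight occupies `bytes_per_weight` bytes.
    """
    bytes_per_filter = in_channels * kernel_h * kernel_w * bytes_per_weight
    if bytes_per_filter > buffer_bytes:
        # A single filter is already too large; input channels would also
        # have to be split, which this sketch does not cover.
        raise ValueError("one filter exceeds the weight buffer")
    filters_per_tile = buffer_bytes // bytes_per_filter
    return [(start, min(start + filters_per_tile, out_channels))
            for start in range(0, out_channels, filters_per_tile)]


# Assumed example layer: 256 filters of shape 128 x 3 x 3, 256 KiB buffer.
tiles = plan_weight_tiles(out_channels=256, in_channels=128,
                          kernel_h=3, kernel_w=3,
                          buffer_bytes=256 * 1024)
print(tiles)
```

In this assumed example the layer's weights total 294,912 bytes, which exceeds the 262,144-byte buffer, so the filters are processed in two passes. A real compiler would also have to schedule the matching loads, computations, and stores for each pass.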

