AR · AI · Jul 7, 2025

Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE

arXiv:2508.14899v1
Originality: Synthesis-oriented
AI Analysis

This work optimizes an AI compiler and runtime for RISC-V hardware. The contribution is incremental: it extends an existing compiler with support for a new target.

This project enables RISC-V microkernel support in IREE to accelerate GenAI workloads, and reports performance gains over upstream IREE and Llama.cpp on the Llama-3.2-1B-Instruct model.

This project enables RISC-V microkernel support in IREE, an MLIR-based machine learning compiler and runtime. The approach first enables the lowering of MLIR linalg dialect contraction ops to the linalg.mmt4d op for the RISC-V64 target within the IREE pass pipeline, then develops optimized microkernels for RISC-V. Performance is compared against upstream IREE and Llama.cpp on the Llama-3.2-1B-Instruct model.
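To make the lowering target concrete, here is a minimal pure-Python sketch of what an mmt4d-style contraction computes: a matmul on operands packed into tiles, with the right-hand side transposed. It is an illustration only, not the project's code. The helper names (`pack`, `unpack`, `matmul_via_mmt4d`) and the 2x2x2 tile sizes are assumptions chosen for readability, and the sketch assumes matrix dimensions divide evenly by the tile sizes.

```python
def matmul(a, b):
    """Reference matmul: a is M x K, b is K x N (lists of lists)."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def transpose(mat):
    return [list(col) for col in zip(*mat)]


def pack(mat, t0, t1):
    """Tile an R x C matrix into [R/t0][C/t1][t0][t1] (hypothetical helper)."""
    rows, cols = len(mat), len(mat[0])
    assert rows % t0 == 0 and cols % t1 == 0, "sketch assumes exact tiling"
    return [[[[mat[r1 * t0 + r0][c1 * t1 + c0] for c0 in range(t1)]
              for r0 in range(t0)]
             for c1 in range(cols // t1)]
            for r1 in range(rows // t0)]


def mmt4d(lhs, rhs):
    """acc[m1][n1][m0][n0] += lhs[m1][k1][m0][k0] * rhs[n1][k1][n0][k0]."""
    M1, K1, M0, K0 = len(lhs), len(lhs[0]), len(lhs[0][0]), len(lhs[0][0][0])
    N1, N0 = len(rhs), len(rhs[0][0])
    acc = [[[[0] * N0 for _ in range(M0)] for _ in range(N1)]
           for _ in range(M1)]
    for m1 in range(M1):
        for n1 in range(N1):
            tile = acc[m1][n1]  # one M0 x N0 output tile
            for k1 in range(K1):
                # The loops below are the fixed-size inner tile product,
                # i.e. the part a hand-optimized microkernel would replace.
                for m0 in range(M0):
                    for n0 in range(N0):
                        for k0 in range(K0):
                            tile[m0][n0] += (lhs[m1][k1][m0][k0]
                                             * rhs[n1][k1][n0][k0])
    return acc


def unpack(acc):
    """Turn [M1][N1][M0][N0] tiles back into an (M1*M0) x (N1*N0) matrix."""
    M1, N1, M0, N0 = len(acc), len(acc[0]), len(acc[0][0]), len(acc[0][0][0])
    return [[acc[r // M0][c // N0][r % M0][c % N0] for c in range(N1 * N0)]
            for r in range(M1 * M0)]


def matmul_via_mmt4d(a, b, m0=2, n0=2, k0=2):
    """Matmul expressed as pack -> mmt4d -> unpack (tile sizes are assumed)."""
    lhs = pack(a, m0, k0)             # [M1][K1][M0][K0]
    rhs = pack(transpose(b), n0, k0)  # [N1][K1][N0][K0], note the transpose
    return unpack(mmt4d(lhs, rhs))


if __name__ == "__main__":
    a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    b = [[1, 0], [0, 1], [2, 3], [4, 5]]
    assert matmul_via_mmt4d(a, b) == matmul(a, b)
```

Packing lets the innermost loops run over small, contiguous, fixed-size tiles, which is what makes them a natural unit to swap for a target-specific microkernel. A real compiler pipeline would pick tile sizes per target and handle dimensions that do not divide evenly, neither of which this sketch attempts.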

