AI · Apr 16, 2025

A Library of LLM Intrinsics for Retrieval-Augmented Generation

Marina Danilevsky, Kristjan Greenewald, Chulaka Gunasekara, Maeda Hanafi, Lihong He, Yannis Katsis, Krishnateja Killamsetty, Yulong Li, Yatin Nandwani, Lucian Popa, Dinesh Raghu, Frederick Reiss

IBM

arXiv:2504.11704v2 · 9.62 citations · h-index: 25

Originality: Synthesis-oriented

AI Analysis

This paper addresses the fragmented development of RAG applications in the LLM community, though its contribution is incremental because it builds on existing concepts such as compiler intrinsics.

The authors tackle the lack of standardized APIs for Retrieval-Augmented Generation (RAG) across LLM providers by proposing a library of LLM intrinsics, released as LoRA adapters on HuggingFace with documented software interfaces.

In the developer community for large language models (LLMs), there is not yet a clean pattern analogous to a software library, to support very large scale collaboration. Even for the commonplace use case of Retrieval-Augmented Generation (RAG), it is not currently possible to write a RAG application against a well-defined set of APIs that are agreed upon by different LLM providers. Inspired by the idea of compiler intrinsics, we propose some elements of such a concept through introducing a library of LLM Intrinsics for RAG. An LLM intrinsic is defined as a capability that can be invoked through a well-defined API that is reasonably stable and independent of how the LLM intrinsic itself is implemented. The intrinsics in our library are released as LoRA adapters on HuggingFace, and through a software interface with clear structured input/output characteristics on top of vLLM as an inference platform, accompanied in both places with documentation and code. This article describes the intended usage, training details, and evaluations for each intrinsic, as well as compositions of multiple intrinsics.
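The idea of an intrinsic as a capability behind a stable API, independent of its implementation, can be illustrated with a minimal Python sketch. Everything named here (`IntrinsicRequest`, `IntrinsicResponse`, `Intrinsic`, `KeywordOverlapIntrinsic`, `AlwaysUnanswerableIntrinsic`, `run_pipeline`) is hypothetical and chosen for illustration only; none of it is taken from the released library, and the toy keyword heuristic merely stands in for a real model-backed implementation such as a LoRA adapter served on vLLM.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class IntrinsicRequest:
    """Hypothetical structured input: a question plus retrieved passages."""
    question: str
    passages: tuple[str, ...]


@dataclass(frozen=True)
class IntrinsicResponse:
    """Hypothetical structured output: a label and a score in [0, 1]."""
    label: str
    score: float


class Intrinsic(Protocol):
    """Hypothetical stable interface; callers depend only on this signature."""

    def invoke(self, request: IntrinsicRequest) -> IntrinsicResponse: ...


class KeywordOverlapIntrinsic:
    """Toy implementation: share of question words that occur in the passages."""

    def invoke(self, request: IntrinsicRequest) -> IntrinsicResponse:
        words = {w.lower().strip("?.,") for w in request.question.split()}
        words.discard("")
        text = " ".join(request.passages).lower()
        hits = sum(1 for w in words if w in text)
        score = hits / len(words) if words else 0.0
        label = "answerable" if score >= 0.5 else "unanswerable"
        return IntrinsicResponse(label, score)


class AlwaysUnanswerableIntrinsic:
    """Second toy implementation with the same interface but different internals."""

    def invoke(self, request: IntrinsicRequest) -> IntrinsicResponse:
        return IntrinsicResponse("unanswerable", 0.0)


def run_pipeline(intrinsic: Intrinsic, request: IntrinsicRequest) -> str:
    """Application code: works with any object that satisfies Intrinsic."""
    return intrinsic.invoke(request).label
```

Because `run_pipeline` sees only the `Intrinsic` interface, either implementation can be swapped in without changing application code, which is the property the abstract attributes to intrinsics.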
