SE | Apr 1

AI Engineering Blueprint for On-Premises Retrieval-Augmented Generation Systems

arXiv:2604.01395 | 59.1 | h-index: 16
AI Analysis

The paper addresses the problem of deploying secure, scalable RAG systems on-premises for enterprises subject to strict data regulations, though it appears incremental, since it builds on existing cloud-focused blueprints.

This paper tackles the lack of comprehensive frameworks for on-premises retrieval-augmented generation (RAG) systems, which enterprises need when data protection regulations rule out cloud services. It presents an AI engineering blueprint comprising a reference architecture, a reference application, and best practices, and it reports ongoing case studies intended to evaluate the blueprint's benefits.

Retrieval-augmented generation (RAG) systems are gaining traction in enterprise settings, yet stringent data protection regulations prevent many organizations from using cloud-based services, necessitating on-premises deployments. Existing blueprints and reference architectures focus on cloud deployments and lack enterprise-grade components, and comprehensive on-premises implementation frameworks remain scarce. This paper addresses this gap by presenting a comprehensive AI engineering blueprint for scalable on-premises enterprise RAG solutions. The blueprint is designed to address common challenges and streamline the integration of RAG into existing enterprise infrastructure. It provides: (1) an end-to-end reference architecture described using the 4+1 view model, (2) a reference application for on-premises deployment, and (3) best practices for tooling, development, and CI/CD pipelines, all publicly available on GitHub. Ongoing case studies and expert interviews with industry partners will assess its practical benefits.

