CL · AI · Jul 13, 2025

eSapiens's DEREK Module: Deep Extraction & Reasoning Engine for Knowledge with LLMs

arXiv:2507.15863v1 · 1 citation · h-index: 3
Originality: Synthesis-oriented
AI Analysis

DEREK provides a production-ready solution for secure, auditable document QA in high-stakes domains such as legal and finance, though the contribution is incremental because it combines existing techniques.

The paper tackles enterprise document question answering by developing a secure Retrieval-Augmented Generation pipeline called DEREK, which improves Precision@10 by approximately 7 percentage points and limits unsupported statements to less than 3% using a verifier.
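The retrieval gains above are reported in Precision@10 and Recall@50, with differences given in percentage points. As a minimal sketch of how such figures are typically computed (the function names and the toy document IDs below are illustrative assumptions, not taken from the paper):

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def percentage_point_gain(baseline: float, improved: float) -> float:
    """Absolute difference between two rates, expressed in percentage points."""
    return (improved - baseline) * 100.0


# Toy data (made up for illustration): two rankings of the same candidates.
relevant = {"a", "c"}
baseline_ranking = ["b", "a", "d", "e"]   # one relevant item in the top 4
reranked_ranking = ["a", "c", "b", "d"]   # two relevant items in the top 4

p_base = precision_at_k(baseline_ranking, relevant, k=4)
p_rerank = precision_at_k(reranked_ranking, relevant, k=4)
gain_pp = percentage_point_gain(p_base, p_rerank)
```

A percentage-point gain is an absolute difference between two rates, so "approximately 7 pp" means Precision@10 rose by about 0.07, not by 7% of its previous value.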

We present the DEREK (Deep Extraction & Reasoning Engine for Knowledge) Module, a secure and scalable Retrieval-Augmented Generation pipeline designed specifically for enterprise document question answering. Designed and implemented by eSapiens, the system ingests heterogeneous content (PDF, Office, web), splits it into 1,000-token overlapping chunks, and indexes them in a hybrid HNSW+BM25 store. User queries are refined by GPT-4o, retrieved via combined vector+BM25 search, reranked with Cohere, and answered by an LLM using CO-STAR prompt engineering. A LangGraph verifier enforces citation overlap, regenerating answers until every claim is grounded. On four LegalBench subsets, 1,000-token chunks improve Recall@50 by approximately 1 pp and hybrid+rerank boosts Precision@10 by approximately 7 pp; the verifier raises TRACe Utilization above 0.50 and limits unsupported statements to less than 3%. All components run in containers and enforce end-to-end TLS 1.3 and AES-256 encryption. These results demonstrate that the DEREK module delivers accurate, traceable, and production-ready document QA with minimal operational overhead. The module is designed to meet enterprise demands for secure, auditable, and context-faithful retrieval, providing a reliable baseline for high-stakes domains such as legal and finance.
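The overlapping chunking step described in the abstract can be sketched as a sliding window over a token list. This is a minimal illustration under stated assumptions, not the eSapiens implementation: the abstract gives the chunk size (1,000 tokens) but not the overlap, so the 100-token overlap here is a placeholder, and the function name `chunk_tokens` is hypothetical. Tokens are represented as plain strings; a real pipeline would presumably use the tokenizer of its embedding model.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[str],
    chunk_size: int = 1000,   # chunk size stated in the abstract
    overlap: int = 100,       # ASSUMPTION: the abstract does not give the overlap
) -> List[List[str]]:
    """Split a token sequence into fixed-size chunks that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once the window has reached the end of the document,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Usage: a 2,500-token stand-in document yields three chunks.
document = [f"tok{i}" for i in range(2500)]
chunks = chunk_tokens(document)
chunk_lengths = [len(c) for c in chunks]
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of indexing some tokens twice.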

