Maurício Cecílio Magnaguagno

7.6IRMay 23

RAGe: A Retrieval-Augmented Generation Evaluation Framework

Larissa Guder, João Pedro de Moura, Arthur Accorsi et al.

Deploying Large Language Model (LLM) applications, particularly those relying on Retrieval-Augmented Generation (RAG), remains challenging due to high computational demands, outdated knowledge bases, and the need to manually select optimal pipeline components. In this work, we propose a modular framework for benchmarking and guiding the efficient development of RAG applications by focusing on resource telemetry and component recommendation, suggesting the best components for a domain-specific dataset. Our approach leverages core techniques in LLM applications, including document chunking, vector databases, embedding models, and retrievers, to evaluate trade-offs among accuracy, efficiency, and scalability. By directly correlating retrieval and generation quality with underlying hardware constraints, RAGe supports researchers to identify the most effective, domain-specific RAG setups for their specific operational needs, facilitating rapid prototyping even on consumer-grade hardware.

AIJul 1, 2022

HyperTensioN and Total-order Forward Decomposition optimizations

Maurício Cecílio Magnaguagno, Felipe Meneguzzi, Lavindra de Silva

Hierarchical Task Networks (HTN) planners generate plans using a decomposition process with extra domain knowledge to guide search towards a planning task. While domain experts develop HTN descriptions, they may repeatedly describe the same preconditions, or methods that are rarely used or possible to be decomposed. By leveraging a three-stage compiler design we can easily support more language descriptions and preprocessing optimizations that when chained can greatly improve runtime efficiency in such domains. In this paper we evaluate such optimizations with the HyperTensioN HTN planner, used in the HTN IPC 2020.

Maurício Cecílio Magnaguagno

2 Papers