SimplyRetrieve: A Private and Lightweight Retrieval-Centric Generative AI Tool
This tool addresses the problem of maintaining privacy and efficiency in generative AI for researchers and practitioners. The contribution is incremental, since it builds on existing retrieval-augmented generation concepts.
The authors address the integration of private data into publicly available generative AI systems with SimplyRetrieve, an open-source tool that implements a Retrieval-Centric Generation approach without requiring model fine-tuning. The result is a lightweight, user-friendly platform for the machine learning community.
Generative AI systems based on Large Language Models (LLMs) have seen significant progress in recent years. Adding a knowledge retrieval architecture allows private data to be integrated seamlessly into publicly available Generative AI systems that use pre-trained LLMs, without requiring additional model fine-tuning. Moreover, the Retrieval-Centric Generation (RCG) approach, a promising future research direction that explicitly separates the roles of LLMs and retrievers in context interpretation and knowledge memorization, can potentially lead to more efficient implementations. SimplyRetrieve is an open-source tool whose goal is to provide the machine learning community with a localized, lightweight, and user-friendly interface to these advancements. SimplyRetrieve features a GUI- and API-based RCG platform, assisted by a Private Knowledge Base Constructor and a Retrieval Tuning Module. By leveraging these capabilities, users can explore the potential of RCG for improving generative AI performance while maintaining privacy standards. The tool is available at https://github.com/RCGAI/SimplyRetrieve under the MIT license.
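To make the separation of roles concrete, the following is a minimal sketch of a retrieval-centric pipeline, not SimplyRetrieve's actual code or API. It assumes a toy bag-of-words retriever in place of a real dense retriever, and it stops at prompt construction rather than calling an LLM. The names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt`, as well as the sample passages, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, top_k=1):
    """Retriever role: knowledge lives in the passages, not in the model."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, retrieved):
    """LLM role: interpret the supplied context instead of recalling facts."""
    context = "\n".join(f"- {p}" for p in retrieved)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical private knowledge base (illustrative data only).
KNOWLEDGE_BASE = [
    "The office cafeteria opens at 8 am on weekdays.",
    "Project Falcon ships its first release in March.",
    "Expense reports are due on the last Friday of each month.",
]

query = "When are expense reports due?"
top = retrieve(query, KNOWLEDGE_BASE, top_k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In this sketch the private passages never enter any model weights. Updating the knowledge only requires editing `KNOWLEDGE_BASE`, and the resulting prompt could be passed to any pre-trained LLM without fine-tuning.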