Towards Comprehensive Vietnamese Retrieval-Augmented Generation and Large Language Models
This work addresses the scarcity of resources available to researchers and developers working on Vietnamese natural language processing, though the contribution appears incremental because it builds on existing RAG and LLM paradigms.
The paper tackles the limited resources for Vietnamese language understanding and generation by developing open datasets and pre-trained models for Vietnamese Retrieval-Augmented Generation and Large Language Models, giving the community accessible tools for advancing the field.
This paper presents our contributions towards advancing the state of Vietnamese language understanding and generation through the development and dissemination of open datasets and pre-trained models for Vietnamese Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs).