CTourLLM: Enhancing LLMs with Chinese Tourism Knowledge
The work addresses a domain-specific problem for users who need accurate Chinese tourism information, but its contribution is incremental: it applies existing fine-tuning methods to new data.
The paper tackles the problem that large language models lack tourism knowledge about Chinese attractions and travel planning. It constructs a supervised fine-tuning dataset (Cultour) and trains CTourLLM, which outperforms ChatGPT by 1.21 in BLEU-1 and 1.54 in Rouge-L.
Recently, large language models (LLMs) have demonstrated their effectiveness in various natural language processing (NLP) tasks. However, a lack of tourism knowledge limits the performance of LLMs in presenting tourist attractions and in travel planning. To address this challenge, we construct a supervised fine-tuning dataset for the Chinese culture and tourism domain, named Cultour. This dataset consists of three parts: tourism knowledge base data, travelogue data, and tourism QA data. Additionally, we propose CTourLLM, a Qwen-based model fine-tuned on Cultour in a supervised manner, to improve the quality of information about attractions and travel planning. To evaluate the performance of CTourLLM, we propose a human evaluation criterion named RRA (Relevance, Readability, Availability) and employ both automatic and human evaluation. The experimental results demonstrate that CTourLLM outperforms ChatGPT, achieving an improvement of 1.21 in BLEU-1 and 1.54 in Rouge-L, which supports the effectiveness of its responses. Our proposed Cultour is accessible at https://github.com/mrweiqk/Cultour.
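For readers unfamiliar with the two automatic metrics cited above, the following is a minimal Python sketch of how BLEU-1 and Rouge-L can be computed for a single candidate/reference pair. It is illustrative only and is not the paper's evaluation script. Several choices here are assumptions: character-level tokenization (a common simplification for Chinese text), a single reference per candidate, the balanced F1 form of Rouge-L, and scores on a 0 to 1 scale. The function names `tokenize`, `bleu1`, `lcs_length`, and `rouge_l` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: character-level tokens with whitespace dropped.
    # The paper's actual tokenization is not stated in the abstract.
    return [ch for ch in text if not ch.isspace()]


def bleu1(candidate, reference):
    """Clipped unigram precision multiplied by the BLEU brevity penalty."""
    cand, ref = tokenize(candidate), tokenize(reference)
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    # Each candidate token is credited at most as often as it occurs
    # in the reference (clipping).
    clipped = sum(min(n, ref_counts[tok]) for tok, n in Counter(cand).items())
    precision = clipped / len(cand)
    # Brevity penalty: candidates no longer than the reference are penalized.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))
    return bp * precision


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Rouge-L as the F1 of LCS-based precision and recall (assumed beta = 1)."""
    cand, ref = tokenize(candidate), tokenize(reference)
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "故宫位于北京市中心"
    candidate = "故宫位于北京"
    print(f"BLEU-1:  {bleu1(candidate, reference):.4f}")
    print(f"Rouge-L: {rouge_l(candidate, reference):.4f}")
```

In practice, corpus-level scores are usually obtained with an established evaluation library and a proper Chinese word segmenter, so absolute values from this sketch would not be directly comparable to the numbers reported above.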