Speeding Up Question Answering Task of Language Models via Inverted Index
This work addresses efficiency issues for developers integrating LLMs into real-world conversational agents, though it is incremental as it applies an existing indexing method to a specific domain.
The paper tackles the high resource consumption of large language models (LLMs) in question-answering tasks by using an inverted index to improve efficiency, reporting a 97.44% reduction in average response time and a 0.23 increase in average BLEU score.
Natural language processing applications, such as conversational agents with question-answering capabilities, are widely used in the real world. Despite the popularity of large language models (LLMs), few real-world conversational agents take advantage of them. The extensive resources that LLMs consume prevent developers from integrating them into end-user applications. In this study, we combine an inverted indexing mechanism with LLMs to improve the efficiency of question-answering models for closed-domain questions. Our experiments show that using the index reduces the average response time by 97.44%. In addition, because the search scope is reduced, the average BLEU score improves by 0.23 when the inverted index is used.
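To make the idea of narrowing the search scope concrete, the following is a minimal sketch of an inverted index placed in front of a question-answering model. It is not the implementation used in the study. The tokenizer, the overlap-based ranking, the `top_k` parameter, and the sample passages are all assumptions chosen for illustration, and the call to the language model itself is left out.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(passages):
    """Map each token to the set of passage ids that contain it."""
    index = defaultdict(set)
    for pid, passage in enumerate(passages):
        for token in tokenize(passage):
            index[token].add(pid)
    return index


def candidate_passages(question, index, top_k=2):
    """Return up to top_k passage ids, ranked by distinct question tokens matched.

    Ties are broken by the lower passage id. This ranking is an
    illustrative assumption, not the scoring used in the paper.
    """
    hits = defaultdict(int)
    for token in set(tokenize(question)):
        for pid in index.get(token, ()):
            hits[pid] += 1
    ranked = sorted(hits, key=lambda pid: (-hits[pid], pid))
    return ranked[:top_k]


# Hypothetical closed-domain corpus.
passages = [
    "The library opens at nine in the morning on weekdays.",
    "Tuition fees are due before the first week of the semester.",
    "The cafeteria serves lunch from noon until two.",
]

index = build_inverted_index(passages)
selected = candidate_passages("When does the library open?", index, top_k=1)

# Only the selected passages would be handed to the QA model as context,
# instead of the whole corpus.
context = " ".join(passages[pid] for pid in selected)
```

In this sketch the index is built once, and each question touches only the posting sets of its own tokens, so the model receives a much smaller context than the full corpus. A production system would likely add stemming, stop-word removal, and a weighted ranking scheme.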