Techniques to Improve Q&A Accuracy with Transformer-based models on Large Complex Documents
This work addresses the challenge that complex documents pose for users who rely on Q&A systems, though the contribution appears incremental because it focuses on optimizing existing methods.
The paper tackles the problem of improving question-answering accuracy on large complex documents by applying text processing techniques to simplify the corpus before using transformer models like BERT, resulting in a statistically significant improvement in accuracy.
This paper discusses the effectiveness of various text processing techniques, their combinations, and encodings in reducing the complexity and size of a given text corpus. The simplified corpus is sent to BERT (or a similar transformer-based model) for question answering, which can yield more relevant responses to user queries. The paper takes a scientific approach to determining the benefits and effectiveness of the various techniques and identifies a best-fit combination that produces a statistically significant improvement in accuracy.
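To make the idea of simplifying a corpus before question answering concrete, here is a minimal sketch in Python. The abstract does not name the specific techniques, so everything here is an assumption chosen for illustration: the stopword list, the term-overlap scoring, and the helper names `tokenize` and `reduce_corpus` are hypothetical and are not the paper's method.

```python
import re

# Hypothetical stopword list (an assumption; the source does not specify one).
STOPWORDS = {
    "the", "a", "an", "of", "to", "in", "and", "is", "are",
    "for", "on", "by", "what", "which", "how", "does", "do",
}


def tokenize(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def reduce_corpus(paragraphs, question, top_k=2):
    """Keep at most top_k paragraphs that share the most distinct terms
    with the question, returned in their original document order."""
    q_terms = set(tokenize(question))
    # Score each paragraph by the number of distinct question terms it contains.
    scored = [(len(q_terms & set(tokenize(p))), i) for i, p in enumerate(paragraphs)]
    # Highest overlap first; ties are broken by original position.
    best = sorted(scored, key=lambda s: (-s[0], s[1]))[:top_k]
    # Discard paragraphs with no overlap and restore document order.
    keep = sorted(i for score, i in best if score > 0)
    return [paragraphs[i] for i in keep]


paragraphs = [
    "The warranty covers battery replacement for two years.",
    "Shipping is free for orders above fifty dollars.",
    "To replace the battery, remove the back cover first.",
]
question = "How do I replace the battery?"

# The reduced context is what would be handed to the QA model.
context = " ".join(reduce_corpus(paragraphs, question, top_k=2))
print(context)
```

In a full pipeline, `context` and `question` would then be passed to a transformer QA model such as BERT, for example through the Hugging Face `pipeline("question-answering")` interface. That step is omitted here so the sketch runs without downloading a model.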