Correct after Answer: Enhancing Multi-Span Question Answering with Post-Processing Method
This work addresses a specific issue in MSQA for NLP researchers, but it is incremental as it builds on existing methods with a post-processing approach.
The paper tackles the problem of incorrect predictions in Multi-Span Question Answering by proposing the Answering-Classifying-Correcting (ACC) framework, which uses post-processing to classify and correct predictions, resulting in significant improvements in Exact Match scores on multiple datasets.
Multi-Span Question Answering (MSQA) requires models to extract one or multiple answer spans from a given context to answer a question. Prior work mainly focuses on designing specific methods or applying heuristic strategies to encourage models to produce more correct predictions. However, these models are trained on gold answers and fail to account for incorrect predictions. Through a statistical analysis, we observe that stronger models do not produce fewer incorrect predictions than other models. In this work, we propose the Answering-Classifying-Correcting (ACC) framework, which employs a post-processing strategy to handle incorrect predictions. Specifically, the ACC framework first introduces a classifier that sorts the predictions into three types and excludes "wrong predictions", then introduces a corrector that modifies "partially correct predictions". Experiments on several MSQA datasets show that the ACC framework significantly improves Exact Match (EM) scores, and further analysis demonstrates that the ACC framework efficiently reduces the number of incorrect predictions, improving the quality of predictions.
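The classify-then-correct flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `acc_post_process`, the `Label` enum, and the callables `classify` and `correct` are hypothetical names, and the label "correct" for the third prediction type is an assumption, since the abstract names only "wrong predictions" and "partially correct predictions". The toy classifier and corrector are rule-based stand-ins that peek at a hard-coded answer list purely to make the example runnable; in the paper both components are learned models.

```python
from enum import Enum
from typing import Callable, List


class Label(Enum):
    CORRECT = "correct"              # assumed name for the third type
    PARTIAL = "partially correct"
    WRONG = "wrong"


def acc_post_process(
    question: str,
    context: str,
    predictions: List[str],
    classify: Callable[[str, str, str], Label],
    correct: Callable[[str, str, str], str],
) -> List[str]:
    """Post-process spans from any MSQA answering model (hypothetical API)."""
    kept: List[str] = []
    for span in predictions:
        label = classify(question, context, span)
        if label is Label.WRONG:
            continue                 # exclude "wrong predictions"
        if label is Label.PARTIAL:
            # modify "partially correct predictions"
            span = correct(question, context, span)
        kept.append(span)
    return kept


# Toy stand-ins for the learned classifier and corrector (illustration only).
TOY_ANSWERS = ["Paris", "Lyon"]


def toy_classify(question: str, context: str, span: str) -> Label:
    if span in TOY_ANSWERS:
        return Label.CORRECT
    if any(answer in span for answer in TOY_ANSWERS):
        return Label.PARTIAL
    return Label.WRONG


def toy_correct(question: str, context: str, span: str) -> str:
    # Trim the span down to the answer it contains.
    return next(answer for answer in TOY_ANSWERS if answer in span)


context = "Paris and Lyon are cities in France. Berlin is in Germany."
question = "Which cities are in France?"
raw_predictions = ["Paris", "Lyon are", "Berlin"]

print(acc_post_process(question, context, raw_predictions,
                       toy_classify, toy_correct))
```

In this toy run, "Paris" is kept unchanged, "Lyon are" is treated as partially correct and trimmed to "Lyon", and "Berlin" is dropped as wrong. Because the answering model is only a source of `predictions`, a post-processing step of this shape can in principle sit behind any existing MSQA model.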