An Interpretable Deep Learning System for Automatically Scoring Request for Proposals
This work addresses the need for efficient and interpretable automated scoring in healthcare contract management, though it is incremental as it adapts existing essay scoring methods to a new domain.
The paper tackles the problem of automatically scoring lengthy Request for Proposals (RFP) responses in Medicaid by developing a Bi-LSTM regression model that provides interpretable insights into phrase-level impacts, with results validated through quantitative experiments and human evaluation.
The Managed Care system within Medicaid (US Healthcare) uses Requests for Proposals (RFPs) to award contracts for various healthcare and related services. RFP responses are very detailed documents (hundreds of pages) submitted by competing organisations to win contracts. Subject matter expertise and domain knowledge play an important role in preparing RFP responses, along with analysis of historical submissions. Automated analysis of these responses through Natural Language Processing (NLP) systems can reduce the time and effort needed to explore historical responses and assist in writing better responses. Our work draws parallels between scoring RFPs and essay scoring models, while highlighting new challenges and the need for interpretability. Typical scoring models focus on word-level impacts to grade essays and other short write-ups. We propose a novel Bi-LSTM based regression model and provide deeper insight into phrases which latently impact the scoring of responses. We demonstrate the merits of our proposed methodology using extensive quantitative experiments. We also qualitatively assess the impact of important phrases using human evaluators. Finally, we introduce a novel problem statement that can be used to further improve the state of the art in NLP-based automatic scoring systems.