CL · Aug 11, 2023

Weakly Supervised Text Classification on Free Text Comments in Patient-Reported Outcome Measures

arXiv:2308.06199v1 · 2 citations · h-index: 40
Originality: Synthesis-oriented
AI Analysis

This work targets the labor-intensive manual analysis of patient-reported outcome data that healthcare researchers face. Its contribution is incremental: it applies existing weakly supervised methods to a new domain.

The paper applies five weakly supervised text classification techniques to free text comments in patient-reported outcome measures (PROMs), aiming to identify health-related quality of life themes in colorectal cancer patient data. Performance was moderate and varied between themes.

Free text comments (FTC) in patient-reported outcome measures (PROMs) data are typically analysed using manual methods, such as content analysis, which is labour-intensive and time-consuming. Machine learning analysis methods are largely unsupervised, necessitating post-analysis interpretation. Weakly supervised text classification (WSTC) can be a valuable method of analysis to classify domain-specific text data in which there is limited labelled data. In this paper, we apply five WSTC techniques to FTC in PROMs data to identify health-related quality of life (HRQoL) themes reported by colorectal cancer patients. The WSTC methods label all the themes mentioned in the FTC. The results showed moderate performance on the PROMs data, mainly due to the precision of the models, and variation between themes. Evaluation of the classification performance illustrated the potential and limitations of keyword-based WSTC to label PROMs FTC when labelled data is limited.
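
As a rough illustration of what keyword-based weak labelling can look like, the sketch below assigns every theme whose seed keywords appear in a comment, so a single comment may receive several labels. It is not one of the five WSTC techniques evaluated in the paper. The theme names, the seed keywords, and the helper functions `tokenize`, `weak_labels`, and `precision` are all hypothetical and chosen only for the example.

```python
from __future__ import annotations

import re

# Hypothetical seed keywords per theme; illustrative only, not taken from the paper.
SEED_KEYWORDS = {
    "bowel_function": {"bowel", "stoma", "diarrhoea"},
    "fatigue": {"tired", "fatigue", "exhausted"},
    "emotional_wellbeing": {"anxious", "worried", "depressed"},
}


def tokenize(text: str) -> set[str]:
    """Lowercase the comment and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def weak_labels(comment: str, seeds: dict[str, set[str]] = SEED_KEYWORDS) -> list[str]:
    """Return every theme with at least one seed keyword in the comment (multi-label)."""
    tokens = tokenize(comment)
    return sorted(theme for theme, words in seeds.items() if tokens & words)


def precision(predicted: list[str], gold: list[str]) -> float:
    """Fraction of predicted themes that are also in the gold labels (0.0 if none predicted)."""
    if not predicted:
        return 0.0
    return len(set(predicted) & set(gold)) / len(set(predicted))


if __name__ == "__main__":
    comment = "I feel tired all the time and worried about my stoma."
    print(weak_labels(comment))
```

Exact keyword matching of this kind is cheap and needs no labelled data, but it over-assigns a theme whenever a keyword appears in an unrelated context. That is one plausible way precision can become the limiting factor for keyword-driven labelling.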
