Clinical Concept Extraction for Document-Level Coding
This work addresses clinical coding from clinical notes for healthcare applications, but its contribution is incremental: it builds on existing approaches and does not improve on their performance.
The paper tackles the problem of extracting document-level clinical codes from clinical notes by proposing two novel syntheses that bridge information extraction and supervised machine learning, but the extracted concepts do not yield performance gains on the task.
The text of clinical notes can be a valuable source of patient information and clinical assessments. Historically, the primary approach for exploiting clinical notes has been information extraction: linking spans of text to concepts in a detailed domain ontology. However, recent work has demonstrated the potential of supervised machine learning to extract document-level codes directly from the raw text of clinical notes. We propose to bridge the gap between the two approaches with two novel syntheses: (1) treating extracted concepts as features, which are used to supplement or replace the text of the note; (2) treating extracted concepts as labels, which are used to learn a better representation of the text. Unfortunately, the resulting concepts do not yield performance gains on the document-level clinical coding task. We explore possible explanations and future research directions.
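The two syntheses can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than the paper's implementation: the function names, the placeholder concept identifier, the `aux_weight` parameter, and in particular the reading of "concepts as labels" as an auxiliary multi-label loss added to the coding loss.

```python
import math

def concepts_as_features(tokens, concepts, mode="supplement"):
    """Synthesis (1): feed extracted concepts to the model as input features.

    "supplement" appends concept IDs to the note's tokens;
    "replace" uses the concept IDs instead of the text.
    """
    if mode == "supplement":
        return list(tokens) + list(concepts)
    if mode == "replace":
        return list(concepts)
    raise ValueError(f"unknown mode: {mode}")

def multi_hot(items, vocab):
    """Encode a set of labels as a 0/1 vector over a fixed vocabulary."""
    present = set(items)
    return [1.0 if v in present else 0.0 for v in vocab]

def bce(probs, targets, eps=1e-9):
    """Mean binary cross-entropy for a multi-label prediction."""
    total = 0.0
    for p, t in zip(probs, targets):
        total += t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return -total / len(targets)

def joint_loss(code_probs, code_targets, concept_probs, concept_targets,
               aux_weight=0.5):
    """Synthesis (2), under the assumption of an auxiliary-task setup:
    the model predicts document-level codes and extracted concepts,
    and the concept loss is added with a hypothetical weight aux_weight.
    """
    return (bce(code_probs, code_targets)
            + aux_weight * bce(concept_probs, concept_targets))

# Hypothetical note tokens and a placeholder concept identifier.
tokens = ["patient", "denies", "chest", "pain"]
concepts = ["CONCEPT_CHEST_PAIN"]
supplemented = concepts_as_features(tokens, concepts)
replaced = concepts_as_features(tokens, concepts, mode="replace")
concept_targets = multi_hot(concepts, ["CONCEPT_CHEST_PAIN", "CONCEPT_FEVER"])
```

In the first synthesis the concept extractor changes what the classifier sees; in the second it changes only the training signal, so the extractor would not be needed at prediction time under this assumed setup.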