CL, LG | Jan 26, 2021

Medical Segment Coloring of Clinical Notes

arXiv:2101.11477v10.2

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable medical note segmentation for healthcare professionals, though it is incremental as it builds on existing deep learning methods for text classification.

The paper tackles the problem of automatically identifying and color-coding segments in clinical notes according to ICD-9 categories. For document multi-labeling, it reports a 64% micro-average F1-score compared to 52.4% for the CAML baseline. Segment coloring was evaluated by medical practitioners, with a median binary score of 83.3% and a mean of 63.7%.
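For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-average F1-score is computed for multi-label documents: true positives, false positives, and false negatives are pooled over all documents and labels before precision and recall are taken. The function name `micro_f1`, the category names, and the toy data are illustrative assumptions and do not come from the paper.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label data.

    y_true, y_pred: lists of sets, one set of labels per document.
    Counts are pooled across all documents before computing the score.
    """
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        tp += len(true_labels & pred_labels)   # labels predicted and correct
        fp += len(pred_labels - true_labels)   # labels predicted but wrong
        fn += len(true_labels - pred_labels)   # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: two notes with broad-category labels.
y_true = [{"circulatory", "respiratory"}, {"digestive"}]
y_pred = [{"circulatory"}, {"digestive", "respiratory"}]
score = micro_f1(y_true, y_pred)
```

Because counts are pooled, frequent categories weigh more heavily in a micro-average than in a macro-average, which averages per-category scores instead.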

This paper proposes a deep learning-based method that identifies the segments of a clinical note corresponding to broad ICD-9 categories and color-codes them with respect to 17 ICD-9 categories. The proposed Medical Segment Colorer (MSC) architecture is a pipeline framework that works in three stages: (1) word categorization, (2) phrase allocation, and (3) document classification. MSC uses gated recurrent unit neural networks (GRUs) to map from an input document to word multi-labels to phrase allocations, and uses the statistical median to map phrase allocations to document multi-labels. We compute variable-length segment coloring from overlapping phrase allocation probabilities. These cross-level bidirectional contextual links identify adaptive context and then produce segment coloring. We train and evaluate MSC using the document-labeled MIMIC-III clinical notes. Training is conducted solely using document multi-labels, without any information on phrases, segments, or words. In addition to coloring a clinical note, MSC generates as byproducts document multi-labeling and word tagging, that is, the creation of ICD-9 category keyword lists based on segment coloring. Compared against CAML (CNN attention multi-label), a method whose purpose is to produce justifiable document multi-labels, the MSC byproduct document multi-labels achieve a micro-average F1-score of 64% versus 52.4%. To evaluate MSC segment coloring, medical practitioners independently assigned the colors to broad ICD-9 categories given a sample of 40 colored notes and a sample of 50 words related to each category based on the word tags. Binary scoring of this evaluation has a median value of 83.3% and a mean of 63.7%.
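The two aggregation steps described in the abstract (median over phrase allocations for document labels, and segment coloring from overlapping phrase probabilities) can be illustrated with a rough sketch. This is not the authors' implementation: the fixed phrase length and stride, the averaging of overlapping phrase probabilities per word, the 0.5 threshold, and the function names `document_labels` and `color_segments` are all assumptions made for illustration, and the GRU stages that would produce the probabilities are replaced by hand-written numbers.

```python
from statistics import median


def document_labels(phrase_probs, threshold=0.5):
    """Document multi-label: categories whose median phrase probability
    reaches the threshold (the threshold value is an assumption)."""
    n_categories = len(phrase_probs[0])
    medians = [median(p[c] for p in phrase_probs) for c in range(n_categories)]
    return [c for c, m in enumerate(medians) if m >= threshold]


def color_segments(n_words, phrase_len, stride, phrase_probs, threshold=0.5):
    """Variable-length segments from overlapping fixed-length phrases.

    Phrase i is assumed to cover words [i*stride, i*stride + phrase_len).
    Each word receives the average probability of the phrases covering it
    (an assumed combination rule), is colored by its best category if that
    average reaches the threshold, and adjacent words with the same color
    are merged into one segment.
    Returns a list of (start, end_exclusive, category) tuples.
    """
    n_categories = len(phrase_probs[0])
    sums = [[0.0] * n_categories for _ in range(n_words)]
    counts = [0] * n_words
    for i, probs in enumerate(phrase_probs):
        start = i * stride
        for w in range(start, min(start + phrase_len, n_words)):
            counts[w] += 1
            for c in range(n_categories):
                sums[w][c] += probs[c]

    colors = []
    for w in range(n_words):
        if counts[w] == 0:
            colors.append(None)  # word not covered by any phrase
            continue
        avg = [s / counts[w] for s in sums[w]]
        best = max(range(n_categories), key=avg.__getitem__)
        colors.append(best if avg[best] >= threshold else None)

    segments = []
    for w, color in enumerate(colors):
        if segments and segments[-1][2] == color:
            segments[-1] = (segments[-1][0], w + 1, color)  # extend the run
        else:
            segments.append((w, w + 1, color))
    return [s for s in segments if s[2] is not None]


# Hypothetical example: 6 words, 2 overlapping phrases, 2 categories.
probs = [[0.9, 0.1], [0.2, 0.8]]
segments = color_segments(n_words=6, phrase_len=4, stride=2, phrase_probs=probs)
labels = document_labels(probs)
```

In this toy input the two phrases overlap on words 2 and 3, so the boundary between the colored segments is decided by the combined probabilities rather than by the phrase edges, which is the sense in which segment length is variable.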
