CL · Aug 19, 2019

Question Answering based Clinical Text Structuring Using Pre-trained Language Model

Jiahui Qiu, Yangming Zhou, Zhiyuan Ma, Tong Ruan, Jinlin Liu, Jing Sun

arXiv:1908.06606v2 · 0.34 citations

Originality: Incremental advance

AI Analysis

This work addresses dataset scarcity and error propagation in clinical text structuring for medical research, though it is incremental as it builds on existing pre-trained models.

The paper recasts clinical text structuring as a question answering task, which unifies otherwise separate tasks and lets them share datasets. It also introduces a model that incorporates domain-specific features into a pre-trained language model. On Chinese pathology reports, the approach proves effective and performs competitively against baselines.

Clinical text structuring is a critical and fundamental task for clinical research. Traditional methods, such as task-specific end-to-end models and pipeline models, usually suffer from a lack of datasets and from error propagation. In this paper, we present a question answering based clinical text structuring (QA-CTS) task to unify different specific tasks and make datasets shareable. We also propose a novel model for the QA-CTS task that introduces domain-specific features (e.g., clinical named entity information) into a pre-trained language model. Experimental results on Chinese pathology reports collected from Ruijing Hospital demonstrate that the QA-CTS task is very effective at improving performance on specific tasks. Our proposed model also competes favorably with strong baseline models on specific tasks.
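To make the task framing concrete, here is a minimal sketch of how several structuring tasks could share one question answering format. It rests on assumptions that the abstract does not state: the answer is taken to be a verbatim span of the report text, and the `QAExample` schema, the `make_example` helper, and the English toy report are all invented for illustration. It does not reproduce the paper's model or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAExample:
    """One QA-style structuring example (hypothetical schema, not from the paper)."""
    paragraph: str  # clinical text to be structured
    question: str   # names the field to extract
    start: int      # answer span start (inclusive character offset)
    end: int        # answer span end (exclusive character offset)

    @property
    def answer(self) -> str:
        # Recover the answer text from the stored span.
        return self.paragraph[self.start:self.end]


def make_example(paragraph: str, question: str, answer: str) -> QAExample:
    """Locate `answer` in `paragraph` and record it as a character span."""
    start = paragraph.find(answer)
    if start < 0:
        raise ValueError("answer must appear verbatim in the paragraph")
    return QAExample(paragraph, question, start, start + len(answer))


# Invented toy report text, for illustration only.
report = "Specimen: distal stomach. Tumor size 3.5 x 2.0 cm. Margins negative."

# Two different structuring tasks expressed in the same format:
# only the question changes, so the examples can live in one dataset.
examples = [
    make_example(report, "What is the tumor size?", "3.5 x 2.0 cm"),
    make_example(report, "What is the specimen site?", "distal stomach"),
]

for ex in examples:
    print(f"{ex.question} -> {ex.answer}")
```

Under this assumed format, adding a new structuring task means writing a new question rather than building a new task-specific dataset, which is the kind of sharing the abstract describes.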
