CL, AI, LG
Aug 16, 2022

DICE: Data-Efficient Clinical Event Extraction with Generative Models

arXiv:2208.07989v2230 citations
h-index: 50
Originality: Incremental advance
AI Analysis

This work addresses a critical bottleneck in clinical NLP, the scarcity of training data, by making event extraction more data-efficient. The advance is incremental, since it builds on existing generative and contrastive learning methods.

The paper tackles clinical event extraction, which is challenging because training data is limited and domain-specific terminology is abundant. It introduces DICE, a generative model that achieves state-of-the-art performance on clinical and news domain datasets, particularly in low-data settings.

Event extraction for the clinical domain is an under-explored research area. The lack of training data along with the high volume of domain-specific terminologies with vague entity boundaries makes the task especially challenging. In this paper, we introduce DICE, a robust and data-efficient generative model for clinical event extraction. DICE frames event extraction as a conditional generation problem and introduces a contrastive learning objective to accurately decide the boundaries of biomedical mentions. DICE also trains an auxiliary mention identification task jointly with event extraction tasks to better identify entity mention boundaries, and further introduces special markers to incorporate identified entity mentions as trigger and argument candidates for their respective tasks. To benchmark clinical event extraction, we compose MACCROBAT-EE, the first clinical event extraction dataset with argument annotation, based on an existing clinical information extraction dataset MACCROBAT. Our experiments demonstrate state-of-the-art performances of DICE for clinical and news domain event extraction, especially under low data settings.
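
The abstract mentions special markers that expose identified entity mentions as trigger and argument candidates. A minimal sketch of that idea follows, assuming token-level, non-overlapping mention spans. The marker strings `<ent>` and `</ent>`, the function name `mark_mentions`, and the example sentence are hypothetical illustrations, not the markers or code used in the paper.

```python
from typing import List, Tuple


def mark_mentions(
    tokens: List[str],
    spans: List[Tuple[int, int]],
    open_marker: str = "<ent>",    # hypothetical marker token
    close_marker: str = "</ent>",  # hypothetical marker token
) -> List[str]:
    """Wrap each mention span with marker tokens.

    Spans are (start, end) token indices with an exclusive end and are
    assumed to be non-empty and non-overlapping.
    """
    starts = {start for start, _ in spans}
    ends = {end for _, end in spans}
    marked: List[str] = []
    for i, token in enumerate(tokens):
        if i in starts:
            marked.append(open_marker)
        marked.append(token)
        if i + 1 in ends:
            marked.append(close_marker)
    return marked


# Hypothetical clinical sentence with one identified mention.
tokens = "The patient developed severe chest pain".split()
marked = mark_mentions(tokens, [(3, 6)])
print(" ".join(marked))
```

The marked sequence could then serve as input to a conditional generation model, so that the candidate mention boundaries are visible to the model as part of its input text.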

