CL Aug 21, 2020

Adapting Event Extractors to Medical Data: Bridging the Covariate Shift

Aakanksha Naik, Jill Lehman, Carolyn Rose

arXiv:2008.09266v1 28.8802 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses domain adaptation for event extraction in medical texts. The contribution is incremental, as it mainly applies existing techniques to new datasets.

The paper tackles adapting event extractors to medical domains without labeled data by aligning the marginal distributions of source and target domains. It creates two datasets, one from clinical notes and one from doctor-patient conversations. The best models reach F1 scores of 70.0 on notes and 72.9 on conversations, using no target-domain labels.

We tackle the task of adapting event extractors to new domains without labeled data, by aligning the marginal distributions of source and target domains. As a testbed, we create two new event extraction datasets using English texts from two medical domains: (i) clinical notes, and (ii) doctor-patient conversations. We test the efficacy of three marginal alignment techniques: (i) adversarial domain adaptation (ADA), (ii) domain adaptive fine-tuning (DAFT), and (iii) a novel instance weighting technique based on language model likelihood scores (LIW). LIW and DAFT improve over a no-transfer BERT baseline on both domains, but ADA only improves on clinical notes. Deeper analysis of performance under different types of shifts (e.g., lexical shift, semantic shift) reveals interesting variations among models. Our best-performing models reach F1 scores of 70.0 and 72.9 on notes and conversations respectively, using no labeled data from target domains.
