CL LG Feb 20, 2020

Federated pretraining and fine tuning of BERT using clinical notes from multiple silos

arXiv:2002.08562v13.738 citations

Originality: Incremental advance

AI Analysis

This addresses privacy and regulatory barriers in healthcare NLP by enabling collaborative model training across data silos.

The paper tackles the challenge of training BERT models on clinical text held by multiple institutions that cannot share data because of privacy concerns. It demonstrates that both pretraining and fine-tuning can be carried out in a federated manner.

Large-scale contextual representation models, such as BERT, have significantly advanced natural language processing (NLP) in recent years. However, in certain areas such as healthcare, accessing diverse, large-scale text data from multiple institutions is extremely challenging for privacy and regulatory reasons. In this article, we show that it is possible to both pretrain and fine-tune BERT models in a federated manner using clinical texts from different silos without moving the data.
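The abstract does not say how the per-silo models are combined, so the following is only a minimal sketch of one common choice, sample-size-weighted federated averaging (FedAvg). The function name `federated_average`, the parameter name `layer.weight`, and the representation of weights as plain lists of floats are assumptions made for illustration, not the paper's implementation. Each site would train locally on its own notes and send only parameters, never text, to the aggregation step.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Combine per-site parameters into one model by weighted averaging.

    Each site's contribution is weighted by its number of training
    examples. Only parameters are exchanged; raw clinical text stays
    at the site that owns it.
    """
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in site_weights[0]:
        length = len(site_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two hypothetical sites with 1 and 3 local examples respectively.
site_a = {"layer.weight": [1.0, 2.0]}
site_b = {"layer.weight": [3.0, 6.0]}
global_weights = federated_average([site_a, site_b], [1, 3])
```

In a full training loop, the averaged weights would be sent back to every site as the starting point for the next round of local training, and the cycle would repeat for both the pretraining and the fine-tuning phases.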
