CLLGNov 4, 2022

BERT for Long Documents: A Case Study of Automated ICD Coding

arXiv:2211.02519v1292 citationsh-index: 10
Originality Incremental advance
AI Analysis

This addresses the challenge of applying transformer models to long documents in medical coding, though it appears incremental as it builds on existing BERT methods.

The paper tackled the problem of automated ICD coding with long documents by presenting a simple and scalable method to process long text using existing transformer models like BERT, resulting in significant improvements over previous transformer results and outperforming a prominent CNN-based method.

Transformer models have achieved great success across many NLP problems. However, previous studies in automated ICD coding concluded that these models fail to outperform some of the earlier solutions such as CNN-based models. In this paper we challenge this conclusion. We present a simple and scalable method to process long text with the existing transformer models such as BERT. We show that this method significantly improves the previous results reported for transformer models in ICD coding, and is able to outperform one of the prominent CNN-based methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes