DMNER: Biomedical Entity Recognition by Detection and Matching
This work addresses the challenge of incorporating external knowledge into BNER for biomedical text mining, though it appears incremental because it builds on existing methods such as SAPBERT and MRC/AutoNER.
The authors tackled biomedical named entity recognition (BNER) by proposing DMNER, a two-step framework combining entity boundary detection and matching using SAPBERT, which improved performance across supervised, distantly supervised, and multi-dataset training scenarios on 10 benchmark datasets.
Biomedical named entity recognition (BNER) serves as the foundation for numerous biomedical text mining tasks. Unlike general NER, BNER requires a comprehensive grasp of the domain, and incorporating external knowledge beyond the training data poses a significant challenge. In this study, we propose a novel BNER framework called DMNER. By leveraging the existing entity representation model SAPBERT, we tackle BNER as a two-step process: entity boundary detection and biomedical entity matching. DMNER is applicable across multiple NER scenarios: 1) In supervised NER, we observe that DMNER effectively rectifies the output of baseline NER models, thereby further enhancing performance. 2) In distantly supervised NER, combining MRC and AutoNER as span boundary detectors enables DMNER to achieve satisfactory results. 3) For training NER by merging multiple datasets, we adopt a framework similar to DS-NER but additionally leverage ChatGPT to obtain high-quality phrases for training. Through extensive experiments conducted on 10 benchmark datasets, we demonstrate the versatility and effectiveness of DMNER.
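The two-step idea of detecting span boundaries and then matching each span against known biomedical entities can be illustrated with a minimal sketch. This is not the authors' implementation. The dictionary, the function names, the 0.9 threshold, and the exhaustive n-gram span enumerator are all hypothetical, and a character-trigram vector is used as a toy stand-in for a learned entity encoder such as SAPBERT.

```python
import math
from collections import Counter

# Hypothetical mini dictionary (entity name -> type); not taken from the paper.
DICTIONARY = {
    "breast cancer": "Disease",
    "aspirin": "Chemical",
    "tp53": "Gene",
}


def embed(text):
    """Toy stand-in for an entity encoder: character-trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_spans(tokens, max_len=3):
    """Step 1 (toy boundary detector): enumerate token n-grams up to max_len."""
    return [
        (start, end)
        for start in range(len(tokens))
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1)
    ]


def match_span(text, threshold=0.9):
    """Step 2: return (type, score) of the nearest dictionary entry, or None."""
    query = embed(text)
    best_name = max(DICTIONARY, key=lambda name: cosine(query, embed(name)))
    score = cosine(query, embed(best_name))
    return (DICTIONARY[best_name], score) if score >= threshold else None


def recognize(sentence, threshold=0.9):
    """Run detection, then keep only spans that match a dictionary entity."""
    tokens = sentence.split()
    entities = []
    for start, end in detect_spans(tokens):
        text = " ".join(tokens[start:end])
        match = match_span(text, threshold)
        if match is not None:
            entities.append((text, match[0]))
    return entities


if __name__ == "__main__":
    print(recognize("Aspirin reduces breast cancer risk"))
```

In this sketch the matching step also acts as a filter: candidate spans whose nearest dictionary entry falls below the similarity threshold are discarded, which loosely mirrors how a matching stage can correct the output of a boundary detector.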