Team Fusion@SU @ BC8 SympTEMIST track: transformer-based approach for symptom recognition and linking
This work addresses symptom identification in medical data, but it is incremental as it applies existing transformer methods to a specific domain task.
The paper tackles symptom recognition and linking in medical texts by fine-tuning a RoBERTa-based model with BiLSTM and CRF layers for NER and using SapBERT for entity linking. Of the design choices examined, the choice of knowledge base had the highest impact on accuracy.
This paper presents a transformer-based approach to the SympTEMIST named entity recognition (NER) and entity linking (EL) tasks. For NER, we fine-tune a RoBERTa-based (1) token-level classifier with BiLSTM and CRF layers on an augmented training set. Entity linking is performed by generating candidates using the cross-lingual SapBERT XLMR-Large (2) and calculating cosine similarity against a knowledge base. The choice of knowledge base proves to have the highest impact on model accuracy.
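The similarity step of the entity linking stage can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors, the codes `CODE_A` to `CODE_C`, and the helper names `cosine` and `rank_candidates` are hypothetical stand-ins. In the described system the vectors would be SapBERT embeddings of the mention and of the knowledge base terms.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def rank_candidates(mention_vec, kb, top_k=3):
    """Return the top_k (code, similarity) pairs, most similar first."""
    scored = [(code, cosine(mention_vec, vec)) for code, vec in kb.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy knowledge base: code -> embedding of its term.
# Real embeddings would come from an encoder, not be written by hand.
TOY_KB = {
    "CODE_A": [1.0, 0.0, 0.0],
    "CODE_B": [0.0, 1.0, 0.0],
    "CODE_C": [0.7, 0.7, 0.0],
}

mention = [0.9, 0.1, 0.0]  # hypothetical embedding of a detected mention
ranking = rank_candidates(mention, TOY_KB, top_k=2)
for code, score in ranking:
    print(f"{code}\t{score:.3f}")
```

Because every candidate comes from the knowledge base, a concept that is missing from it can never be returned, which is one way to see why the choice of knowledge base can weigh so heavily on accuracy.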