Augmenting BERT Carefully with Underrepresented Linguistic Features
This work addresses Alzheimer's Disease detection, a medical application, but the contribution is incremental: it builds on existing BERT-based methods by adding feature augmentation.
The paper tackles Alzheimer's Disease detection from speech transcripts by identifying linguistic features that are underrepresented in BERT and supplementing them, which improves performance by up to 5% over fine-tuned BERT alone.
Fine-tuned Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models have proven effective for detecting Alzheimer's Disease (AD) from transcripts of human speech. However, previous research shows that BERT's performance on various tasks can be improved by augmenting the model with additional information. In this work, we use probing tasks as introspection techniques to identify linguistic information that is not well represented in various layers of BERT but is important for the AD detection task. For the linguistic features where BERT's representations are found to be insufficient, we supply hand-crafted features externally, and show that jointly fine-tuning BERT in combination with these features improves AD classification performance by up to 5% over fine-tuned BERT alone.
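The probing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vectors below are synthetic stand-ins for frozen BERT layer activations, the feature names, the helper functions, and the 0.75 accuracy threshold are all hypothetical, and the concatenation in `augment` is only one assumed way of combining hand-crafted features with a BERT representation. A simple logistic-regression probe is trained per feature, and features whose probe accuracy stays low are flagged as candidates for external supplementation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probe(reps, labels, epochs=100, lr=0.5):
    """Fit a logistic-regression probe on frozen representations with SGD."""
    w = [0.0] * len(reps[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(reps, labels):
            g = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def probe_accuracy(probe, reps, labels):
    """Fraction of examples whose feature label the probe predicts correctly."""
    w, b = probe
    hits = sum(
        int((sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == bool(y))
        for x, y in zip(reps, labels)
    )
    return hits / len(labels)


def make_layer(n, encodes_feature, rng):
    """Synthetic stand-in for one layer's activations (not real BERT output).

    Each noise vector is emitted once per label; the label shifts the first
    dimension only when the layer is assumed to encode the feature.
    """
    reps, labels = [], []
    for _ in range(n):
        noise = [rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2)]
        for y in (0, 1):
            signal = (2 * y - 1) if encodes_feature else 0.0
            reps.append([noise[0] + signal, noise[1]])
            labels.append(y)
    return reps, labels


def underrepresented(features, threshold=0.75, seed=0):
    """Return feature names whose held-out probe accuracy is below threshold."""
    rng = random.Random(seed)
    flagged = []
    for name, encoded in features.items():
        train = make_layer(40, encoded, rng)
        test = make_layer(40, encoded, rng)
        if probe_accuracy(train_probe(*train), *test) < threshold:
            flagged.append(name)
    return flagged


def augment(bert_vector, handcrafted):
    # Assumed fusion: plain concatenation ahead of the classifier head.
    return list(bert_vector) + list(handcrafted)
```

Calling `underrepresented({"encoded_feature": True, "missing_feature": False})` flags only the feature that the synthetic layer does not encode. In a real setup the synthetic vectors would be replaced by hidden states extracted from a chosen BERT layer, and the flagged features would be computed by hand and passed alongside the BERT representation during joint fine-tuning.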