AdaptKeyBERT: An Attention-Based approach towards Few-Shot & Zero-Shot Domain Adaptation of KeyBERT
This work addresses the challenge of data scarcity in keyword extraction for NLP applications, offering an incremental improvement over existing methods.
The paper tackles the problem of keyword extraction with limited data by proposing AdaptKeyBERT, a pipeline for few-shot and zero-shot domain adaptation using LLM bases and regularized attention, achieving competitive performance on new benchmarks.
Keyword extraction is an important topic in modern natural language processing, with applications ranging from ontology generation and fact verification in summarized text to recommendation systems. Although it has been applied successfully in data-intensive settings, its performance often suffers when the dataset is small. Downstream training of keyword extractors is a lengthy process that requires a significant amount of data. Recently, few-shot learning (FSL) and zero-shot learning (ZSL) have been proposed to tackle this problem. We therefore propose AdaptKeyBERT, a pipeline for training keyword extractors with LLM bases that incorporates regularized attention into a pre-training phase for downstream domain adaptation. Because we believe our work can be used within FSL/ZSL and keyword extraction pipelines, we open-source our code and provide a fine-tuning library of the same name, AdaptKeyBERT, at https://github.com/AmanPriyanshu/AdaptKeyBERT.