A Practical Incremental Learning Framework For Sparse Entity Extraction
It addresses the costly data annotation required for sparse entity extraction with an incremental framework that combines existing methods, namely Entity Set Expansion (ESE) and Active Learning (AL).
This work tackles annotation cost and model accuracy in sparse entity extraction by introducing an incremental learning framework that integrates Entity Set Expansion and Active Learning. On three datasets, it reduces annotation cost by an average of 85% to reach a 0.9 F-Score and by an average of 45% to reach a 1.0 F-Score.
This work addresses challenges that arise when extracting entities from textual data, including the high cost of data annotation, model accuracy, the selection of appropriate evaluation criteria, and the overall quality of annotation. We present a framework that integrates Entity Set Expansion (ESE) and Active Learning (AL) to reduce the annotation cost of sparse data and to provide an online evaluation method as feedback. This incremental and interactive learning framework allows for rapid annotation and subsequent extraction of sparse data while maintaining high accuracy. We evaluate our framework on three publicly available datasets and show that it drastically reduces the cost of sparse entity annotation: by an average of 85% to reach a 0.9 F-Score and by an average of 45% to reach a 1.0 F-Score. Moreover, the method exhibits robust performance across all datasets.
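To make the interplay between set expansion and active learning concrete, the following is a minimal toy sketch, not the framework evaluated in this work. It assumes each candidate term is described by a set of context words, scores candidates by their mean Jaccard similarity to the entities confirmed so far (a stand-in for ESE), and asks a human oracle about the candidate whose score is closest to the decision threshold (a simple uncertainty-sampling stand-in for AL). All names (`incremental_annotate`, `entity_score`, `contexts`, `oracle`, `threshold`) and the toy data are hypothetical.

```python
from typing import Callable, Dict, Set, Tuple


def f_score(predicted: Set[str], gold: Set[str]) -> float:
    """Balanced F-score (harmonic mean of precision and recall) over sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets of context words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def entity_score(term: str, positives: Set[str],
                 contexts: Dict[str, Set[str]]) -> float:
    """Mean similarity of a candidate to the confirmed entities (ESE stand-in)."""
    if not positives:
        return 0.0
    return sum(jaccard(contexts[term], contexts[p]) for p in positives) / len(positives)


def incremental_annotate(contexts: Dict[str, Set[str]], seeds: Set[str],
                         oracle: Callable[[str], bool], budget: int,
                         threshold: float = 0.5) -> Tuple[Set[str], int]:
    """Alternate between querying the oracle and re-scoring candidates."""
    positives, negatives, queries = set(seeds), set(), 0
    while queries < budget:
        unlabeled = [t for t in contexts if t not in positives and t not in negatives]
        if not unlabeled:
            break
        # AL stand-in: query the candidate whose score is closest to the threshold.
        # The term itself breaks ties so the choice is deterministic.
        query = min(unlabeled,
                    key=lambda t: (abs(entity_score(t, positives, contexts) - threshold), t))
        (positives if oracle(query) else negatives).add(query)
        queries += 1
    # ESE stand-in: accept remaining unlabeled candidates that score above the threshold.
    predicted = set(positives)
    for term in contexts:
        if term not in positives and term not in negatives:
            if entity_score(term, positives, contexts) >= threshold:
                predicted.add(term)
    return predicted, queries


# Hypothetical toy data: candidate terms and the words seen around them.
contexts = {
    "aspirin": {"take", "mg", "dose"},
    "ibuprofen": {"take", "mg", "dose"},
    "naproxen": {"take", "mg"},
    "table": {"sit", "wood"},
    "chair": {"sit", "wood", "leg"},
}
gold = {"aspirin", "ibuprofen", "naproxen"}

predicted, queries = incremental_annotate(
    contexts, seeds={"aspirin"}, oracle=lambda term: term in gold, budget=2)
print(sorted(predicted), queries, f_score(predicted, gold))
```

In this sketch the annotator only labels the terms the scorer is least sure about, while confident candidates are accepted or rejected automatically; that is the general mechanism by which such a loop can lower annotation cost. The `f_score` helper here compares against a known gold set for illustration only and is not the online evaluation method described above.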