CL · Mar 25, 2024

Is There a One-Model-Fits-All Approach to Information Extraction? Revisiting Task Definition Biases

Wenhao Huang, Qianyu He, Zhixu Li, Jiaqing Liang, Yanghua Xiao

arXiv:2403.16396v113.825 citations · h-index: 22 · Has Code · EMNLP

Originality: Incremental advance

AI Analysis

The paper addresses a problem relevant to information extraction researchers and practitioners by highlighting and reducing biases that affect model performance, though its contribution is incremental, building on existing bias mitigation techniques.

The paper tackles definition bias in information extraction, which misleads models across and within datasets, and proposes a multi-stage framework that effectively mitigates this bias.

Definition bias is a negative phenomenon that can mislead models. Definition bias in information extraction appears not only across datasets from different domains but also within datasets sharing the same domain. We identify two types of definition bias in IE: bias among information extraction datasets and bias between information extraction datasets and instruction tuning datasets. To systematically investigate definition bias, we conduct three probing experiments to quantitatively analyze it and discover the limitations of unified information extraction and large language models in solving definition bias. To mitigate definition bias in information extraction, we propose a multi-stage framework consisting of definition bias measurement, bias-aware fine-tuning, and task-specific bias mitigation. Experimental results demonstrate the effectiveness of our framework in addressing definition bias. Resources of this paper can be found at https://github.com/EZ-hwh/definition-bias
