Cost-sensitive Regularization for Label Confusion-aware Event Detection
This work addresses label confusion in event detection, a problem specific to NLP research. The contribution is incremental: it adds a regularization technique on top of existing models.
The paper tackles label confusion in supervised event detection, where mislabeling is concentrated in a small set of confusing type pairs. It proposes cost-sensitive regularization to focus training on these pairs and reports significant performance improvements on the TAC-KBP 2017 datasets in both English and Chinese.
In supervised event detection, most mislabeling occurs between a small number of confusing type pairs, including trigger-NIL pairs and sibling sub-types of the same coarse type. To address this label confusion problem, this paper proposes cost-sensitive regularization, which forces the training procedure to concentrate on optimizing confusing type pairs. Specifically, we introduce a cost-weighted term into the training loss, which penalizes mislabeling between confusing label pairs more heavily. Furthermore, we propose two estimators that measure such label confusion based on instance-level or population-level statistics. Experiments on the TAC-KBP 2017 datasets demonstrate that the proposed method significantly improves the performance of different models in both English and Chinese event detection.
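The abstract does not give the exact form of the cost-weighted term or of the two estimators, so the following is only a minimal sketch of one plausible reading, not the paper's formulation. It assumes the regularizer adds, to the usual cross-entropy, the probability mass placed on each wrong label weighted by a cost for the (gold, predicted) pair, and that a population-level cost can be estimated from a confusion matrix of a reference model. The function names, the weight `lam`, the label set, and the confusion counts are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def population_cost(confusion):
    """Hypothetical population-level estimator (an assumption, not the paper's).

    confusion[g][p] counts instances with gold label g predicted as p.
    cost[g][p] is the share of gold-g instances mislabeled as p; the
    diagonal stays 0 because a correct prediction carries no cost.
    """
    n = len(confusion)
    cost = [[0.0] * n for _ in range(n)]
    for g in range(n):
        row_total = sum(confusion[g])
        if row_total == 0:
            continue  # no evidence for this gold label; leave costs at 0
        for p in range(n):
            if p != g:
                cost[g][p] = confusion[g][p] / row_total
    return cost

def cost_sensitive_loss(logits, gold, cost, lam=1.0):
    """Cross-entropy plus an assumed cost-weighted penalty for one instance.

    The penalty sums cost[gold][j] * P(j | x) over all wrong labels j, so
    probability mass on a frequently confused label is penalized more.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[gold])
    penalty = sum(cost[gold][j] * probs[j]
                  for j in range(len(probs)) if j != gold)
    return cross_entropy + lam * penalty

# Hypothetical label set: 0 = NIL, 1 and 2 = sibling sub-types, 3 = unrelated type.
confusion = [
    [90, 5, 5, 0],
    [10, 70, 20, 0],
    [8, 22, 70, 0],
    [2, 0, 0, 98],
]
cost = population_cost(confusion)

# Same gold score in both cases; only the label receiving the runner-up score differs.
loss_sibling = cost_sensitive_loss([0.0, 2.0, 1.5, 0.0], gold=1, cost=cost)
loss_unrelated = cost_sensitive_loss([0.0, 2.0, 0.0, 1.5], gold=1, cost=cost)
```

In this toy setup the cross-entropy part is identical for both instances, so any difference between `loss_sibling` and `loss_unrelated` comes from the penalty alone: mass placed on the sibling sub-type costs more than the same mass placed on the unrelated type. An instance-level estimator would instead derive the cost from per-example statistics, which this sketch does not attempt.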