AI · CL · Mar 7, 2024

Wiki-TabNER: Integrating Named Entity Recognition into Wikipedia Tables

arXiv:2403.04577v2 · 1 citation · h-index: 13 · SIGIR
AI Analysis

This work addresses the need for more realistic datasets for named entity recognition in tables, benefiting researchers in NLP and table understanding, though it is incremental as it builds on existing benchmarks.

The authors address the problem of overly simplified table interpretation datasets by creating Wiki-TabNER, a more challenging dataset of complex tables with multiple entities per cell, annotated using DBpedia classes. They also propose a prompting framework for evaluating large language models on this task, and their qualitative analysis reveals the challenges the models face and the limitations of the dataset.

Interest in solving table interpretation tasks has grown over the years, yet the field still relies on existing datasets that may be overly simplified. This potentially reduces the effectiveness of these datasets for thorough evaluation, and they fail to accurately represent tables as they appear in the real world. To enrich the existing benchmark datasets, we extract and annotate a new, more challenging dataset. The proposed Wiki-TabNER dataset features complex tables containing several entities per cell, with named entities labeled using DBpedia classes. This dataset is specifically designed to address the named entity recognition (NER) task within tables, but it can also be used as a more challenging dataset for evaluating the entity linking task. In this paper, we describe the distinguishing features of the Wiki-TabNER dataset and the labeling process. In addition, we propose a prompting framework for evaluating new large language models on the within-table NER task. Finally, we perform a qualitative analysis to gain insights into the challenges encountered by the models and to understand the limitations of the proposed dataset.

Code Implementations: 1 repo