CL · May 24, 2023

InteractiveIE: Towards Assessing the Strength of Human-AI Collaboration in Improving the Performance of Information Extraction

arXiv:2305.14659v2 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the challenge of bootstrapping information extraction templates in real-world settings where training data is expensive, such as the biomedical and legal domains. It appears incremental, since it builds on prior template-based approaches and adds human interaction.

The paper tackles template-based information extraction without predefined schemas. It uses automatic question generation to induce template slots and investigates how minimal human supervision (InteractiveIE) can boost performance. Experiments on biomedical and legal documents show performance improvements over AI-only baselines, though no specific numbers are provided.

Learning template-based information extraction from documents is a crucial yet difficult task. Prior template-based IE approaches assume foreknowledge of the domain templates; however, real-world IE does not have pre-defined schemas and is a figure-out-as-you-go phenomenon. To quickly bootstrap templates in a real-world setting, we need to induce template slots from documents with zero or minimal supervision. Since the purpose of question answering intersects with the goal of information extraction, we use automatic question generation to induce template slots from the documents and investigate how a tiny amount of proxy human supervision on the fly (termed InteractiveIE) can further boost performance. Extensive experiments on biomedical and legal documents, where obtaining training data is expensive, reveal encouraging trends of performance improvement using InteractiveIE over an AI-only baseline.

