CL · May 8, 2023

A Unified Evaluation Framework for Novelty Detection and Accommodation in NLP with an Instantiation in Authorship Attribution

arXiv:2305.05079v1 · 223 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for robust NLP systems in real-world settings where novel classes appear. It is incremental, however, because it establishes an evaluation framework rather than solving the problem itself.

The authors address the handling of novel instances in NLP by introducing 'NoveltyTask', a multi-stage framework for evaluating novelty detection and accommodation. They instantiate it with authorship attribution on a dataset of 250k instances across 200 authors and find that baseline methods perform poorly, which indicates how challenging the task is.

State-of-the-art natural language processing models have been shown to achieve remarkable performance in 'closed-world' settings, where all the labels in the evaluation set are known at training time. However, in real-world settings, 'novel' instances that do not belong to any known class are often observed. This makes the ability to deal with novelties crucial. To initiate systematic research in this important area of 'dealing with novelties', we introduce 'NoveltyTask', a multi-stage task to evaluate a system's performance on pipelined novelty 'detection' and 'accommodation' tasks. We provide a mathematical formulation of NoveltyTask and instantiate it with the authorship attribution task, which pertains to identifying the correct author of a given text. We use the Amazon reviews corpus to compile a large dataset (250k instances across 200 authors/labels) for NoveltyTask. We conduct comprehensive experiments and explore several baseline methods for the task. Our results show that these methods achieve considerably low performance, which makes the task challenging and leaves sufficient room for improvement. Finally, we believe our work will encourage research in this underexplored area of dealing with novelties, an important step en route to developing robust systems.

