CL | LG | Oct 20, 2023

A Unified View of Evaluation Metrics for Structured Prediction

Microsoft
arXiv:2310.13793v1135 citations | h-index: 60 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses inconsistent evaluation in structured prediction, a problem for both researchers and practitioners. The contribution is incremental, since it builds on existing metric concepts.

The paper tackles the proliferation of evaluation metrics across structured prediction tasks. It proposes a unified conceptual framework that represents task outputs as objects of certain data types and derives metrics by matching common substructures, optionally followed by normalization. The authors release a library for creating new metrics and suggest modifications to existing ones.

We present a conceptual framework that unifies a variety of evaluation metrics for different structured prediction tasks (e.g. event and relation extraction, syntactic and semantic parsing). Our framework requires representing the outputs of these tasks as objects of certain data types, and derives metrics through matching of common substructures, possibly followed by normalization. We demonstrate how commonly used metrics for a number of tasks can be succinctly expressed by this framework, and show that new metrics can be naturally derived in a bottom-up way based on an output structure. We release a library that enables this derivation to create new metrics. Finally, we consider how specific characteristics of tasks motivate metric design decisions, and suggest possible modifications to existing metrics in line with those motivations.
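The "match common substructures, then normalize" idea can be illustrated with a minimal sketch. It is not the released library's API: the `Relation` type, the function names, and the example triples are all hypothetical, and exact set intersection stands in for whatever matching the framework actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical output type: a (head, label, tail) relation triple."""
    head: str
    label: str
    tail: str


def overlap(predicted: set, reference: set) -> int:
    """Unnormalized score: the number of substructures that match exactly."""
    return len(predicted & reference)


def precision_recall_f1(predicted: set, reference: set):
    """Normalize the raw overlap by each side's size, then combine into F1."""
    matched = overlap(predicted, reference)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only (an assumption, not taken from the paper).
reference = {
    Relation("Ada", "works_for", "Acme"),
    Relation("Acme", "based_in", "Oslo"),
}
predicted = {
    Relation("Ada", "works_for", "Acme"),
    Relation("Ada", "born_in", "Oslo"),
}

p, r, f1 = precision_recall_f1(predicted, reference)
print(p, r, f1)
```

Under these assumptions, one of the two predicted triples matches one of the two reference triples, so precision, recall, and F1 all come out to one half. Replacing the exact-match intersection with a partial-credit similarity would be one way to derive a different metric over the same output type.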

Code Implementations: 1 repo
