LG ST | Sep 17, 2022

DiPietro-Hazari Kappa: A Novel Metric for Assessing Labeling Quality via Annotation

arXiv:2209.08243v11.81 citations | h-index: 3 | Has Code

Originality: Synthesis-oriented

AI Analysis

The paper addresses machine learning practitioners' need for better ways to assess labeling quality in data annotation, but it appears incremental because it builds on an existing measure.

The paper tackles the problem of assessing data label quality in machine learning by introducing DiPietro-Hazari Kappa, a novel metric based on Fleiss's Kappa that quantifies annotator agreement above random chance. The abstract provides no concrete numbers.

Data is a key component of modern machine learning, but statistics for assessing data label quality remain sparse in literature. Here, we introduce DiPietro-Hazari Kappa, a novel statistical metric for assessing the quality of suggested dataset labels in the context of human annotation. Rooted in the classical Fleiss's Kappa measure of inter-annotator agreement, the DiPietro-Hazari Kappa quantifies the empirical annotator agreement differential that was attained above random chance. We offer a thorough theoretical examination of Fleiss's Kappa before turning to our derivation of DiPietro-Hazari Kappa. Finally, we conclude with a matrix formulation and set of procedural instructions for easy computational implementation.
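
For readers unfamiliar with the classical measure the abstract builds on, the following is a minimal Python sketch of standard Fleiss's Kappa. It is not the paper's DiPietro-Hazari Kappa, whose derivation is not given in the abstract. The function name `fleiss_kappa` and the input layout (one row per item, one column per category, each entry counting the annotators who chose that category) are assumptions made for illustration.

```python
def fleiss_kappa(counts):
    """Classical Fleiss's Kappa (illustrative sketch, not the paper's metric).

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of annotators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two annotators per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Per-item agreement: fraction of annotator pairs that agree on the item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    observed = sum(per_item) / n_items   # mean observed agreement
    expected = sum(x * x for x in p)     # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")

    # Agreement attained above chance, scaled by the maximum attainable.
    return (observed - expected) / (1.0 - expected)
```

With three annotators and two categories, `fleiss_kappa([[3, 0], [0, 3]])` gives full agreement, while a split such as `[[2, 1], [1, 2]]` yields a negative value, meaning agreement below what chance alone would produce.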
