CL, Jul 17, 2024

$\textit{GeoHard}$: Towards Measuring Class-wise Hardness through Modelling Class Semantics

arXiv:2407.12512v1, 26 citations, h-index: 16
Originality: Highly original
AI Analysis

This work addresses the challenge of understanding class-wise hardness for improving task learning in natural language understanding, offering a fresh perspective on data diagnosis.

This paper tackles the problem of measuring class-specific hardness in datasets, which previous instance-level metrics overlooked. It proposes GeoHard, a method that models class geometry in the semantic embedding space. The results show that GeoHard surpasses instance-level metrics by over 59% in Pearson's correlation when measuring class-wise hardness across eight NLU datasets.

Recent advances in measuring the hardness of data guide language models in sample selection within low-resource scenarios. However, class-specific properties are overlooked in task setup and learning. How do these properties influence model learning, and do their effects generalize across datasets? To answer this question, this work formally introduces the concept of $\textit{class-wise hardness}$. Experiments across eight natural language understanding (NLU) datasets demonstrate a consistent hardness distribution across learning paradigms, models, and human judgment. Subsequent experiments reveal that instance-level metrics from previous works struggle to measure such class-wise hardness. To address this, we propose $\textit{GeoHard}$ for class-wise hardness measurement by modeling class geometry in the semantic embedding space. $\textit{GeoHard}$ surpasses instance-level metrics by over 59 percent in $\textit{Pearson}$'s correlation when measuring class-wise hardness. Our analysis theoretically and empirically underscores the generality of $\textit{GeoHard}$ as a fresh perspective on data diagnosis. Additionally, we showcase how understanding class-wise hardness can practically aid in improving task learning.
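To make the idea of "modeling class geometry in the semantic embedding space" more concrete, here is a minimal sketch. It is not the $\textit{GeoHard}$ formula, which this abstract does not state. It assumes a hypothetical score equal to a class's intra-class spread minus its mean centroid distance to the other classes, so that spread-out classes sitting close to their neighbours score as harder. The function names, the toy 2-D embeddings, and the scoring rule are all illustrative assumptions. The `pearson` helper shows how such per-class scores could be compared against a reference hardness signal, such as per-class error rates.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_geometry_scores(embeddings_by_class):
    """Hypothetical class-wise score (an assumption, not the paper's metric).

    score = mean distance of a class's points to its own centroid
            minus mean distance from its centroid to the other centroids.
    Higher means more spread out and closer to other classes.
    Requires at least two classes.
    """
    centroids = {c: centroid(vs) for c, vs in embeddings_by_class.items()}
    scores = {}
    for c, vs in embeddings_by_class.items():
        spread = sum(euclidean(v, centroids[c]) for v in vs) / len(vs)
        others = [euclidean(centroids[c], centroids[o])
                  for o in centroids if o != c]
        separation = sum(others) / len(others)
        scores[c] = spread - separation
    return scores


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy 2-D "embeddings" (made up for illustration): class "a" lies between
# the other two classes, so it receives the highest score under this rule.
toy = {
    "a": [[0.0, 0.0], [0.0, 2.0]],
    "b": [[3.0, 1.0], [5.0, 1.0]],
    "c": [[0.0, 4.0], [0.0, 6.0]],
}
toy_scores = class_geometry_scores(toy)
print(toy_scores)
```

In practice the vectors would come from a sentence encoder, and the per-class scores would be correlated with a measured hardness signal via `pearson`. Both of those choices are left open here.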
