LG · AI · CL · Oct 23, 2023

Meta learning with language models: Challenges and opportunities in the classification of imbalanced text

arXiv:2310.15019v2 · 1 citation · h-index: 17
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of imbalanced text classification for content moderation, but it appears incremental as it builds on existing meta learning and threshold-moving methods.

The paper tackles the challenge of detecting out-of-policy speech (OOPS) content by proposing a meta learning technique combined with threshold-moving to improve performance on imbalanced datasets, showing statistically significant advantages.

Detecting out-of-policy speech (OOPS) content is important but difficult. While machine learning is a powerful tool for this challenging task, it is hard to break the performance ceiling because of factors such as limits on the quantity and quality of training data and inconsistencies in OOPS definitions and data labeling. To realize the full potential of the limited resources available, we propose a meta learning technique (MLT) that combines individual models built with different text representations. We analytically show that the resulting technique is numerically stable and produces reasonable combining weights. We combine the MLT with a threshold-moving (TM) technique to further improve the performance of the combined predictor on highly imbalanced in-distribution and out-of-distribution datasets. We also provide computational results to show the statistically significant advantages of the proposed MLT approach. All authors contributed equally to this work.
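
The two ideas in the abstract, combining per-model scores with weights and then moving the decision threshold, can be illustrated with a minimal sketch. This is not the paper's MLT: the abstract does not state how the combining weights are derived, so the weights here are simply supplied by the caller. The function names (`combine_scores`, `f1_at`, `move_threshold`), the F1 criterion, and the threshold grid are all assumptions made for illustration.

```python
import numpy as np


def combine_scores(scores, weights):
    """Weighted average of per-model scores.

    scores: array-like of shape (n_models, n_samples), each entry in [0, 1].
    weights: array-like of shape (n_models,), non-negative (assumed given;
             the paper derives its own combining weights).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1
    return w @ np.asarray(scores, dtype=float)


def f1_at(y_true, scores, threshold):
    """F1 of the positive class when predicting 1 for scores >= threshold."""
    y_true = np.asarray(y_true)
    pred = np.asarray(scores) >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def move_threshold(y_true, scores, grid=None):
    """Pick the threshold on a validation set that maximises F1.

    The grid 0.05, 0.10, ..., 0.95 is an illustrative choice.
    """
    grid = np.arange(1, 20) / 20.0 if grid is None else np.asarray(grid)
    f1s = [f1_at(y_true, scores, t) for t in grid]
    return float(grid[int(np.argmax(f1s))])


if __name__ == "__main__":
    # Synthetic, highly imbalanced validation data (about 5% positives).
    rng = np.random.default_rng(0)
    y_val = (rng.random(2000) < 0.05).astype(int)
    # Two hypothetical models whose scores are noisy functions of the label.
    model_a = np.clip(0.25 * y_val + rng.normal(0.2, 0.1, y_val.size), 0, 1)
    model_b = np.clip(0.20 * y_val + rng.normal(0.25, 0.1, y_val.size), 0, 1)

    combined = combine_scores([model_a, model_b], weights=[0.6, 0.4])
    t_best = move_threshold(y_val, combined)
    print("F1 at default 0.5:", f1_at(y_val, combined, 0.5))
    print("moved threshold:", t_best, "F1:", f1_at(y_val, combined, t_best))
```

On imbalanced data the positive class often receives scores well below 0.5, so a threshold tuned on held-out data can recover positives that the default cut-off misses. The threshold should be selected on a validation split, not on the test set.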

Code Implementations: 1 repo