CL | AI | IR | LG | Apr 18, 2024

When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes

IBM
arXiv:2404.12365v1 | 5 citations | h-index: 8 | Has Code
Originality: Incremental advance
AI Analysis

FastFit offers a user-friendly solution for NLP practitioners who need efficient multiclass classification, though the work appears incremental because it builds on existing few-shot learning approaches.

The paper tackles the problem of fast and accurate few-shot text classification with many similar classes by introducing FastFit, which achieves a 3-20x improvement in training speed and better accuracy compared to existing methods.

We present FastFit, a method and a Python package designed to provide fast and accurate few-shot classification, especially for scenarios with many semantically similar classes. FastFit utilizes a novel approach integrating batch contrastive learning and a token-level similarity score. Compared to existing few-shot learning packages, such as SetFit, Transformers, or few-shot prompting of large language models via API calls, FastFit significantly improves multiclass classification performance in speed and accuracy across FewMany, our newly curated English benchmark, and Multilingual datasets. FastFit demonstrates a 3-20x improvement in training speed, completing training in just a few seconds. The FastFit package is now available on GitHub and PyPI, presenting a user-friendly solution for NLP practitioners.
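To make the two ingredients named in the abstract more concrete, here is a minimal NumPy sketch of a token-level similarity score and a batch contrastive loss. It is an illustration under stated assumptions, not the FastFit implementation: the similarity is assumed to be a late-interaction "best match per token" score, the loss is assumed to be a standard supervised contrastive loss over a batch, and the function names and the `temperature` value are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def token_level_similarity(a_tokens, b_tokens):
    """Assumed scoring rule: for each token in `a`, take its best cosine
    match among the tokens of `b`, then sum over the tokens of `a`.

    a_tokens: (n_a, d) token embeddings of one text
    b_tokens: (n_b, d) token embeddings of another text
    """
    a = l2_normalize(np.asarray(a_tokens, dtype=float))
    b = l2_normalize(np.asarray(b_tokens, dtype=float))
    sim = a @ b.T  # (n_a, n_b) pairwise token cosine similarities
    return float(sim.max(axis=1).sum())


def batch_contrastive_loss(scores, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (assumed form).

    scores: (B, B) similarity between every pair of texts in the batch
    labels: (B,) class label of each text
    Texts sharing a label are positives; every other text in the batch
    acts as a negative. Requires B >= 2.
    """
    scores = np.asarray(scores, dtype=float) / temperature
    labels = np.asarray(labels)
    batch = scores.shape[0]
    not_self = ~np.eye(batch, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-softmax of each row over all *other* items (self is excluded).
    masked = np.where(not_self, scores, -np.inf)
    row_max = masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(masked - row_max).sum(axis=1, keepdims=True))
    log_prob = scores - log_denom

    pos_counts = positives.sum(axis=1)
    has_pos = pos_counts > 0
    if not has_pos.any():
        return 0.0  # no anchor has a same-class partner in this batch
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / pos_counts[has_pos]).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Four toy "texts", each with 5 random token embeddings of size 8.
    texts = [rng.normal(size=(5, 8)) for _ in range(4)]
    labels = [0, 0, 1, 1]
    scores = np.array([[token_level_similarity(a, b) for b in texts] for a in texts])
    print("loss:", batch_contrastive_loss(scores, labels))
```

Note that this token-level score is asymmetric (it sums over the tokens of the first argument), and that it grows with text length, so in practice the scores would likely need normalizing or a tuned temperature.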

Code Implementations: 1 repo
