The Catalan Language CLUB
This work provides a new benchmark for evaluating language models in Catalan, addressing a gap for Catalan speakers and researchers, but it is incremental because it adapts an existing framework to a new language.
The authors introduce the Catalan Language Understanding Benchmark (CLUB), a collection of datasets modeled after GLUE for evaluating language models on a range of natural language understanding (NLU) tasks, with the aim of supporting the Catalan language in AI.
The Catalan Language Understanding Benchmark (CLUB) comprises datasets representative of different NLU tasks, enabling accurate evaluation of language models, following the example of the General Language Understanding Evaluation (GLUE) benchmark. It is part of AINA and PlanTL, two public funding initiatives to empower the Catalan language in the Artificial Intelligence era.