CL · AI · Feb 15, 2022

Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models

arXiv:2202.07791v1 · 1 citation · Has Code
Originality: Synthesis-oriented
AI Analysis

Russian SuperGLUE 1.1 provides a standardized evaluation framework for Russian NLP researchers and practitioners, but the contribution is incremental: it builds on existing benchmark concepts.

The authors address the problem of evaluating Russian NLP models by releasing Russian SuperGLUE 1.1, an updated benchmark with improved datasets and tooling. The release fixes vulnerabilities left open in the previous version and supports recent models, though the abstract provides no specific performance numbers.

In the last year, new neural architectures and multilingual pre-trained models have been released for Russian, which led to performance evaluation problems across a range of language understanding tasks. This paper presents Russian SuperGLUE 1.1, an updated benchmark styled after GLUE for Russian NLP models. The new version includes a number of technical, user experience and methodological improvements, including fixes for benchmark vulnerabilities left unresolved in the previous version: novel and improved tests for understanding the meaning of a word in context (RUSSE) along with reading comprehension and common sense reasoning (DaNetQA, RuCoS, MuSeRC). Together with the release of the updated datasets, we improve the benchmark toolkit based on the jiant framework for consistent training and evaluation of NLP models of various architectures, which now supports the most recent models for Russian. Finally, we provide the integration of Russian SuperGLUE with a framework for industrial evaluation of open-source models, MOROCCO (MOdel ResOurCe COmparison), in which the models are evaluated according to the weighted average metric over all tasks, the inference speed, and the amount of RAM occupied. Russian SuperGLUE is publicly available at https://russiansuperglue.com/.
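
The three quantities the abstract lists for MOROCCO (a weighted average over tasks, inference speed, and memory footprint) can be illustrated with a minimal Python sketch. This is not the MOROCCO implementation: the function names `weighted_average` and `profile`, the task names, the weights, and the scores are all hypothetical placeholders, and `tracemalloc` is used here as a stand-in that tracks Python-level allocations only, not the process RAM that a real evaluation would measure.

```python
import time
import tracemalloc
from typing import Callable, Dict, Sequence


def weighted_average(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of per-task scores; weights need not sum to 1."""
    total_weight = sum(weights[task] for task in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[task] * weights[task] for task in scores) / total_weight


def profile(predict: Callable[[str], object], samples: Sequence[str]) -> Dict[str, float]:
    """Measure throughput (samples/second) and peak traced Python memory (MiB)."""
    tracemalloc.start()
    start = time.perf_counter()
    for sample in samples:
        predict(sample)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "samples_per_second": len(samples) / elapsed if elapsed > 0 else float("inf"),
        "peak_mib": peak_bytes / (1024 * 1024),
    }


# Placeholder numbers for illustration only; these are not results from the paper.
scores = {"task_a": 0.5, "task_b": 1.0}
weights = {"task_a": 1.0, "task_b": 3.0}
quality = weighted_average(scores, weights)  # (0.5*1 + 1.0*3) / 4 = 0.875

# A trivial stand-in for a model's inference call (hypothetical).
report = profile(lambda text: text.lower().split(), ["пример текста"] * 1000)
```

Reporting quality, speed, and memory side by side, rather than folding them into one number, keeps the trade-off between accuracy and resource cost visible when comparing models.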

