CL, AI | Jul 19, 2022

Benchmarking Transformers-based models on French Spoken Language Understanding tasks

Oralie Cattan, Sahar Ghannay, Christophe Servan, Sophie Rosset

arXiv:2207.09152v11.14 citations | h-index: 15

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for benchmarking on under-resourced languages like French, but it is incremental as it applies existing models to new data without introducing novel methods.

The paper benchmarks thirteen Transformer-based models on two French spoken language understanding tasks (MEDIA and ATIS-FR), showing that compact models can achieve results comparable to larger ones with a lower ecological impact, though this depends on the compression method.

In the last five years, the rise of self-attentional Transformer-based architectures has led to state-of-the-art performance on many natural language tasks. Although these approaches are increasingly popular, they require large amounts of data and computational resources. There is still a substantial and growing need for benchmarking methodologies for under-resourced languages in data-scarce application conditions. Most pre-trained language models have been studied extensively on English, and only a few of them have been evaluated on French. In this paper, we propose a unified benchmark focused on evaluating model quality and ecological impact on two well-known French spoken language understanding tasks. Specifically, we benchmark thirteen well-established Transformer-based models on the two available spoken language understanding tasks for French: MEDIA and ATIS-FR. Within this framework, we show that compact models can reach results comparable to bigger ones while their ecological impact is considerably lower. However, this finding is nuanced and depends on the compression method considered.
