CL · Jun 16, 2020

How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation

Steffen Eger, Johannes Daxenberger, Iryna Gurevych

arXiv:2006.09109v231.11003 citations · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of evaluating sentence embeddings in low-resource languages. The problem matters for multilingual NLP, but the contribution is an incremental methodological study.

The paper investigates how to design probing tasks for sentence embeddings in low-resource languages, showing that structural design choices like dataset size and classifier type significantly affect outcomes, and finds that English probing results don't transfer to other languages.

Sentence encoders map sentences to real-valued vectors for use in downstream applications. To peek into these representations - e.g., to increase interpretability of their results - probing tasks have been designed which query them for linguistic knowledge. However, designing probing tasks for lesser-resourced languages is tricky, because these languages often lack the large-scale annotated data or (high-quality) dependency parsers that probing task design in English relies on. To determine how to probe sentence embeddings in such cases, we investigate the sensitivity of probing task results to structural design choices, conducting the first such large-scale study. We show that design choices like the size of the annotated probing dataset and the type of classifier used for evaluation do (sometimes substantially) influence probing outcomes. We then probe embeddings in a multilingual setup with design choices that lie in a 'stable region', as we identify for English, and find that results on English do not transfer to other languages. Fairer and more comprehensive sentence-level probing evaluation should thus be carried out on multiple languages in the future.
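To make the idea of "structural design choices" concrete, the following is a minimal sketch of a sensitivity sweep over probing-set size and probe classifier type. It is not the authors' setup: the embeddings are synthetic random vectors standing in for real sentence embeddings, the two probes (nearest centroid and perceptron) are illustrative choices, and every name (`make_probe_data`, `train_centroid`, `train_perceptron`, `sweep`) is hypothetical.

```python
import random


def make_probe_data(n, dim=16, seed=0):
    """Synthetic stand-in for (sentence embedding, probing label) pairs.

    Assumption: a binary probing label shifts the first four dimensions;
    the remaining dimensions are pure noise.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for j in range(4):
            vec[j] += 1.0 if label == 1 else -1.0
        data.append((vec, label))
    return data


def train_centroid(train):
    """Nearest-centroid probe: one mean vector per class."""
    dim = len(train[0][0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for vec, y in train:
        counts[y] += 1
        for j, v in enumerate(vec):
            sums[y][j] += v
    centroids = {y: [s / max(counts[y], 1) for s in sums[y]] for y in sums}

    def predict(vec):
        dist = {y: sum((a - b) ** 2 for a, b in zip(vec, c))
                for y, c in centroids.items()}
        return min(dist, key=dist.get)

    return predict


def train_perceptron(train, epochs=10):
    """Linear probe trained with the classic perceptron update rule."""
    dim = len(train[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for vec, y in train:
            target = 1 if y == 1 else -1
            score = sum(wi * xi for wi, xi in zip(w, vec)) + b
            if target * score <= 0:  # misclassified: move the boundary
                w = [wi + target * xi for wi, xi in zip(w, vec)]
                b += target

    def predict(vec):
        return 1 if sum(wi * xi for wi, xi in zip(w, vec)) + b > 0 else 0

    return predict


def accuracy(predict, test):
    return sum(predict(v) == y for v, y in test) / len(test)


def sweep(train_sizes=(20, 100, 1000), seed=0):
    """Probe accuracy for every (classifier type, training size) pair."""
    test = make_probe_data(500, seed=seed + 1)
    pool = make_probe_data(max(train_sizes), seed=seed)
    probes = (("centroid", train_centroid), ("perceptron", train_perceptron))
    return {(name, n): accuracy(trainer(pool[:n]), test)
            for name, trainer in probes for n in train_sizes}


if __name__ == "__main__":
    for (name, n), acc in sorted(sweep().items()):
        print(f"{name:10s} n={n:5d} acc={acc:.3f}")
```

With real encoders, one would replace `make_probe_data` with embeddings of annotated sentences and compare how the ranking of encoders shifts across the grid. A region of the grid where rankings stay put would correspond, loosely, to what the abstract calls a 'stable region'.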
