CL · Jan 29, 2024

ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks

Bolei Ma, Ercong Nie, Shuzhou Yuan, Helmut Schmid, Michael Färber, Frauke Kreuter, Hinrich Schütze

arXiv:2401.16589v2 · 26.9105 citations · h-index: 70 · Has Code · EACL

Originality: Incremental advance

AI Analysis

This work addresses a gap in prompt-based methods for cross-lingual sequence labeling. It offers an approach that improves zero-shot transfer for tasks such as NER and POS tagging, though the advance is incremental because it builds on existing prompt-tuning frameworks.

The paper tackles the challenge of applying prompt-based methods to token-level sequence labeling tasks, such as Named Entity Recognition and Part-of-Speech tagging, in cross-lingual settings. It proposes ToPro, which decomposes each sentence into tokens and builds an individual prompt for every token. ToPro achieves state-of-the-art performance with the mT5 model and outperforms existing methods, especially for languages typologically different from English.

Prompt-based methods have been successfully applied to multilingual pretrained language models for zero-shot cross-lingual understanding. However, most previous studies primarily focused on sentence-level classification tasks, and only a few considered token-level labeling tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging. In this paper, we propose Token-Level Prompt Decomposition (ToPro), which facilitates the prompt-based method for token-level sequence labeling tasks. The ToPro method decomposes an input sentence into single tokens and applies one prompt template to each token. Our experiments on multilingual NER and POS tagging datasets demonstrate that ToPro-based fine-tuning outperforms Vanilla fine-tuning and Prompt-Tuning in zero-shot cross-lingual transfer, especially for languages that are typologically different from the source language English. Our method also attains state-of-the-art performance when employed with the mT5 model. Besides, our exploratory study in multilingual large language models shows that ToPro performs much better than the current in-context learning method. Overall, the performance improvements show that ToPro could potentially serve as a novel and simple benchmarking method for sequence labeling tasks.
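
The decomposition step described in the abstract (one prompt per token, each built from the same template) can be illustrated with a minimal Python sketch. The template wording, the `<mask>` placeholder, and the names `TEMPLATE` and `decompose` are assumptions made for illustration; they are not taken from the paper or its released code.

```python
from typing import List

# Hypothetical template: the wording is an assumption, not the paper's exact prompt.
TEMPLATE = 'Sentence: {sentence} The label of "{token}" is <mask>.'


def decompose(tokens: List[str], template: str = TEMPLATE) -> List[str]:
    """Build one prompt per token; every prompt carries the full sentence as context."""
    sentence = " ".join(tokens)
    return [template.format(sentence=sentence, token=token) for token in tokens]


# A three-token sentence yields three prompts, one per token to be labeled.
prompts = decompose(["Angela", "visited", "Paris"])
for prompt in prompts:
    print(prompt)
```

In this sketch a sentence of n tokens becomes n model inputs, and the model would be asked to fill the masked slot with a label (for example an entity type or a POS tag) for the quoted token.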
