CL | Apr 29, 2024

Spivavtor: An Instruction Tuned Ukrainian Text Editing Model

Aman Saini, Artem Chernodub, Vipul Raheja, Vivek Kulkarni

DeepMind

arXiv:2404.18880v1 | 24.181 citations | h-index: 13 | UNLP

Originality: Synthesis-oriented

AI Analysis

Spivavtor provides a domain-specific tool for Ukrainian language processing, though the contribution is incremental because it adapts an existing English model.

The authors address the lack of Ukrainian text editing models by creating Spivavtor, an instruction-tuned model adapted from the English CoEdIT model, and report superior performance on tasks such as grammatical error correction and text simplification.

We introduce Spivavtor, a dataset, and instruction-tuned models for text editing focused on the Ukrainian language. Spivavtor is the Ukrainian-focused adaptation of the English-only CoEdIT model. Similar to CoEdIT, Spivavtor performs text editing tasks by following instructions in Ukrainian. This paper describes the details of the Spivavtor-Instruct dataset and Spivavtor models. We evaluate Spivavtor on a variety of text editing tasks in Ukrainian, such as Grammatical Error Correction (GEC), Text Simplification, Coherence, and Paraphrasing, and demonstrate its superior performance on all of them. We publicly release our best-performing models and data as resources to the community to advance further research in this space.
