Russian Texts Detoxification with Levenshtein Editing
This work addresses text detoxification for Russian, offering an incremental improvement over existing methods.
The paper tackles the detoxification of Russian texts with a two-step tagging-based model that achieved the best style transfer accuracy in the RUSSE Detox shared task.
Text detoxification is a style transfer task of creating neutral versions of toxic texts. In this paper, we use the concept of text editing to build a two-step tagging-based detoxification model using a parallel corpus of Russian texts. With this model, we achieved the best style transfer accuracy among all models in the RUSSE Detox shared task, surpassing larger sequence-to-sequence models.
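To make the text-editing idea concrete, the sketch below shows one way token-level edit tags can be derived from a parallel (toxic, neutral) pair through a Levenshtein alignment. It is an illustration under assumptions, not the model described in this paper: the tag names (KEEP, DELETE, REPLACE, INSERT), the function names `levenshtein_tags` and `apply_tags`, and the whitespace tokenization are all hypothetical, and the English example sentences stand in for the Russian corpus.

```python
from typing import List, Optional, Tuple

# (tag, source token or None, target token or None)
Edit = Tuple[str, Optional[str], Optional[str]]


def levenshtein_tags(source: List[str], target: List[str]) -> List[Edit]:
    """Align two token lists with Levenshtein distance and return the edits."""
    n, m = len(source), len(target)

    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a source token
                dist[i][j - 1] + 1,         # insert a target token
                dist[i - 1][j - 1] + cost,  # keep or replace
            )

    # Walk back from the bottom-right corner to recover one optimal edit path.
    edits: List[Edit] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            edits.append(("KEEP", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            edits.append(("REPLACE", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            edits.append(("DELETE", source[i - 1], None))
            i -= 1
        else:
            edits.append(("INSERT", None, target[j - 1]))
            j -= 1
    edits.reverse()
    return edits


def apply_tags(edits: List[Edit]) -> List[str]:
    """Rebuild the target tokens from a list of edits."""
    return [tgt for tag, _, tgt in edits if tag != "DELETE" and tgt is not None]


if __name__ == "__main__":
    toxic = "this is a stupid idea".split()
    neutral = "this is a bad idea".split()
    for edit in levenshtein_tags(toxic, neutral):
        print(edit)
```

In a tagging-based setup, tags like these could serve as per-token training labels: a first step predicts which tokens to keep, delete, or change, and a second step fills in the replacement or inserted tokens. How the two steps are actually split in the paper's model is not specified in this abstract.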