CL, Jun 16, 2025

K/DA: Automated Data Generation Pipeline for Detoxifying Implicitly Offensive Language in Korean

arXiv:2506.13513v1, 3 citations, h-index: 1, ACL
Originality: Incremental advance
AI Analysis

The paper addresses the problem of outdated, labor-intensive datasets for detoxifying implicitly offensive language, particularly in Korean, and offers incremental improvements in automation and applicability.

The paper tackles the challenge of creating up-to-date paired datasets for language detoxification. It introduces K/DA, an automated pipeline that generates implicitly offensive language containing trend-aligned slang. The resulting dataset shows high pair consistency and greater implicit offensiveness than existing Korean datasets, and it enables effective training of a high-performing detoxification model.

Language detoxification involves removing toxicity from offensive language. While a neutral-toxic paired dataset provides a straightforward approach for training detoxification models, creating such datasets presents two challenges: i) the need for human annotation to build paired data, and ii) the rapid evolution of offensive terms, which quickly renders static datasets outdated. To tackle these challenges, we introduce an automated paired data generation pipeline called K/DA. The pipeline is designed to generate implicitly offensive language with trend-aligned slang, making the resulting dataset suitable for detoxification model training. We demonstrate that the dataset generated by K/DA exhibits high pair consistency and greater implicit offensiveness compared to existing Korean datasets, and that the pipeline is applicable to other languages. Furthermore, the dataset enables effective training of a high-performing detoxification model with simple instruction fine-tuning.
