Artificial Text Detection with Multiple Training Strategies
This work addresses the detection of AI-generated text in Russian to combat misuse in areas such as fake news, though its contribution is incremental, since it builds on existing models for a specific shared task.
The paper tackles the problem of detecting artificial texts and identifying the specific model that generated them, proposing a DeBERTa-based method with multiple training strategies for the RuATD 2022 shared task and achieving second place in the multi-class evaluation.
With the rapid development of deep learning, artificial texts created by generative models are now commonly used in news and social media. However, such models can be abused to generate product reviews, fake news, and even fake political content. This paper proposes a solution for the Russian Artificial Text Detection in the Dialogue shared task 2022 (RuATD 2022), in which the goal is to identify which model from a given list generated a text. We use the DeBERTa pre-trained language model with multiple training strategies for this shared task. Extensive experiments conducted on the RuATD dataset validate the effectiveness of our proposed method. Moreover, our submission ranked second in the evaluation phase of RuATD 2022 (Multi-Class).