CL · Jan 12, 2024

Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection

arXiv:2401.06752v1 · 6 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses document provenance and authentication challenges posed by AI-generated text, offering incremental advancements in stylometry for multi-authored documents.

The paper tackles authorship detection in multi-authored documents by proposing a merit-based fusion framework that integrates NLP algorithms and weight optimization, achieving significant improvements over existing solutions on benchmark tasks like classification and author change detection.

In recent years, the increasing use of Artificial Intelligence-based text generation tools has posed new challenges in document provenance, authentication, and authorship detection. However, advancements in stylometry have provided opportunities for automatic authorship and author change detection in multi-authored documents using style analysis techniques. Style analysis can serve as a primary step toward document provenance and authentication through authorship detection. This paper investigates three key tasks of style analysis: (i) classification of single and multi-authored documents, (ii) single change detection, which involves identifying the point where the author switches, and (iii) multiple author-switching detection in multi-authored documents. We formulate all three tasks as classification problems and propose a merit-based fusion framework that integrates several state-of-the-art natural language processing (NLP) algorithms and weight optimization techniques. We also explore the effect of special characters, which are typically removed during pre-processing in NLP applications, on the performance of the proposed methods for these tasks by conducting extensive experiments on both cleaned and raw datasets. Experimental results demonstrate significant improvements over existing solutions for all three tasks on a benchmark dataset.
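As a rough illustration of what a merit-based fusion with weight optimization could look like, the sketch below combines the positive-class probabilities of several classifiers by a weighted average. It is not the paper's implementation: the abstract does not name the base models or the optimizer, so the toy scores, the accuracy-proportional weights, the coarse grid search, and all function names (`accuracy`, `fuse`, `merit_weights`, `grid_search_weights`) are assumptions made for this example.

```python
from itertools import product


def accuracy(probs, labels):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def fuse(model_probs, weights):
    """Weighted average of per-model positive-class probabilities."""
    total = sum(weights)
    n_samples = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total
        for i in range(n_samples)
    ]


def merit_weights(model_probs, labels):
    """Assumed merit score: each model's accuracy on a validation set."""
    return [accuracy(probs, labels) for probs in model_probs]


def grid_search_weights(model_probs, labels, steps=5):
    """Coarse exhaustive search, a stand-in for any weight optimizer."""
    grid = [i / steps for i in range(steps + 1)]
    best_w, best_acc = None, -1.0
    for w in product(grid, repeat=len(model_probs)):
        if sum(w) == 0:  # skip the all-zero weight vector
            continue
        acc = accuracy(fuse(model_probs, w), labels)
        if acc > best_acc:
            best_w, best_acc = list(w), acc
    return best_w, best_acc


# Toy validation data: 1 = multi-authored, 0 = single-authored (hypothetical).
labels = [1, 0, 1, 1, 0, 0]
model_probs = [
    [0.9, 0.2, 0.4, 0.8, 0.3, 0.6],  # hypothetical model A
    [0.6, 0.4, 0.7, 0.3, 0.1, 0.2],  # hypothetical model B
    [0.7, 0.6, 0.6, 0.6, 0.4, 0.4],  # hypothetical model C
]

weights = merit_weights(model_probs, labels)
fused_acc = accuracy(fuse(model_probs, weights), labels)
best_w, best_acc = grid_search_weights(model_probs, labels)
print("merit weights:", weights, "fused accuracy:", fused_acc)
print("grid-search weights:", best_w, "accuracy:", best_acc)
```

In this toy setup each model misclassifies at least one document on its own, while the fused score can still classify all six correctly, which is the general motivation for late fusion. The same scheme would apply unchanged to the change-detection tasks once each is cast as a binary decision, for example over adjacent paragraph pairs.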

