CL · Nov 12, 2013

Authorship Attribution Using Word Network Features

arXiv:1311.2978v1 · 11 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses authorship identification for forensic or literary analysis, but it appears incremental as it builds on existing complex network ideas.

The authors tackled authorship attribution by proposing novel features derived from word network representations of text, achieving promising results across three datasets.

In this paper, we explore a set of novel features for authorship attribution of documents. These features are derived from a word network representation of natural language text. As has been noted in previous studies, natural language tends to show complex network structure at word level, with low degrees of separation and scale-free (power law) degree distribution. There has also been work on authorship attribution that incorporates ideas from complex networks. The goal of our paper is to explore properties of these complex networks that are suitable as features for machine-learning-based authorship attribution of documents. We performed experiments on three different datasets, and obtained promising results.
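To make the idea of word network features concrete, here is a minimal sketch of how a document might be turned into a small feature vector. The abstract does not list the paper's actual features, so everything here is an assumption for illustration: words are linked when they appear next to each other, and the three statistics (average degree, average clustering coefficient, average shortest-path length) are common complex-network measures chosen because they relate to the degree distribution and low degrees of separation mentioned above. The function names are hypothetical.

```python
import re
from collections import defaultdict, deque


def build_word_network(text):
    """Undirected graph: nodes are lowercased words, edges join adjacent words.

    Assumption: adjacency-based linking; the paper may define edges differently.
    """
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from immediate repetitions
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


def average_degree(graph):
    """Mean number of distinct neighbours per word."""
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)


def average_clustering(graph):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        nbr_list = list(nbrs)
        links = sum(
            1
            for i, u in enumerate(nbr_list)
            for v in nbr_list[i + 1:]
            if v in graph[u]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over reachable ordered pairs, via BFS."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def network_features(text):
    """Hypothetical feature vector: [avg degree, avg clustering, avg path length]."""
    graph = build_word_network(text)
    if not graph:  # fewer than two distinct adjacent words
        return [0.0, 0.0, 0.0]
    return [
        average_degree(graph),
        average_clustering(graph),
        average_path_length(graph),
    ]
```

A vector like this, computed per document, could be passed to any standard classifier alongside or instead of more conventional stylometric features. Which classifier and which features the authors actually used is not stated in the abstract.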
