CL, Aug 16, 2016

Authorship clustering using multi-headed recurrent neural networks

arXiv:1608.04485v1, 28 citations

Originality: Incremental advance

AI Analysis

This paper addresses authorship attribution for short documents on disparate topics, but the advance is incremental because it builds on existing neural network methods.

The paper tackles the problem of clustering documents by unknown authors. It uses a multi-headed recurrent neural network to model the language of each document and to measure similarity between documents. The approach yields statistically significant authorship predictions but struggles to group documents into clusters with high accuracy.

A recurrent neural network that has been trained to separately model the language of several documents by unknown authors is used to measure similarity between the documents. It is able to find clues of common authorship even when the documents are very short and about disparate topics. While it is easy to make statistically significant predictions regarding authorship, it is difficult to group documents into definite clusters with high accuracy.
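The abstract gives no architectural details, so the following is only a minimal sketch of one way such a model could be laid out. It assumes a shared character-level recurrent layer with one softmax output head per document, and it scores similarity as the mean bits per character of each document under each head. The class name `MultiHeadedCharRNN`, the function `cross_entropy_matrix`, the plain tanh recurrence, and the hidden size are all illustrative assumptions, not the paper's implementation. Training is omitted and the weights are random, so the numbers show the data flow only.

```python
import numpy as np


def encode(text, vocab):
    """Map a string to an array of integer character ids."""
    return np.array([vocab[c] for c in text], dtype=np.int64)


class MultiHeadedCharRNN:
    """Shared recurrent layer with one output head per document (hypothetical layout)."""

    def __init__(self, vocab_size, hidden_size, n_heads, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        # Shared parameters: every document passes through the same recurrence.
        self.W_xh = rng.normal(0, scale, (vocab_size, hidden_size))
        self.W_hh = rng.normal(0, scale, (hidden_size, hidden_size))
        self.b_h = np.zeros(hidden_size)
        # Per-document parameters: one softmax head for each document.
        self.W_hy = rng.normal(0, scale, (n_heads, hidden_size, vocab_size))
        self.b_y = np.zeros((n_heads, vocab_size))

    def cross_entropy(self, ids, head):
        """Mean bits per character of `ids` when predicted by the given head."""
        h = np.zeros(self.W_hh.shape[0])
        total = 0.0
        for cur, nxt in zip(ids[:-1], ids[1:]):
            h = np.tanh(self.W_xh[cur] + h @ self.W_hh + self.b_h)
            logits = h @ self.W_hy[head] + self.b_y[head]
            logits = logits - logits.max()  # numerical stability
            log_probs = logits - np.log(np.exp(logits).sum())
            total -= log_probs[nxt]
        return total / (len(ids) - 1) / np.log(2)


def cross_entropy_matrix(docs, hidden_size=16, seed=0):
    """Entry [i, j] is the cost of document i under the head for document j."""
    vocab = {c: i for i, c in enumerate(sorted(set("".join(docs))))}
    model = MultiHeadedCharRNN(len(vocab), hidden_size, len(docs), seed)
    encoded = [encode(d, vocab) for d in docs]
    return np.array([
        [model.cross_entropy(ids, head) for head in range(len(docs))]
        for ids in encoded
    ])


docs = [
    "the cat sat on the mat",
    "a dog ran in the park",
    "the cat ran to the dog",
]
matrix = cross_entropy_matrix(docs)
print(matrix.shape)
```

In a trained model of this shape, a low off-diagonal entry would suggest that the head fitted to one document also predicts another document well, which could be read as a clue of common authorship. Turning such scores into definite clusters is the step the abstract describes as difficult.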
