TraSE: Towards Tackling Authorial Style from a Cognitive Science Perspective
This work addresses stylistic analysis for tasks such as authorship attribution and forensic analysis, proposing a feature representation intended to overcome the limitations of existing methods.
The paper tackles authorial style analysis by introducing TraSE, a novel feature representation that addresses issues such as topic influence and limited discriminability when the number of authors is large. It reports 90% attribution accuracy in cross-domain experiments with over 27,000 authors and 1.4 million samples.
Stylistic analysis of text is a key task in research areas ranging from authorship attribution to forensic analysis and personality profiling. Existing approaches to stylistic analysis are plagued by issues such as topic influence, a lack of discriminability for a large number of authors, and the requirement for large amounts of diverse data. In this paper, the sources of these issues are identified, along with the necessity of a cognitive perspective on authorial style in addressing them. A novel feature representation, called Trajectory-based Style Estimation (TraSE), is introduced to support this purpose. Authorship attribution experiments with over 27,000 authors and 1.4 million samples in a cross-domain scenario resulted in 90% attribution accuracy, suggesting that the feature representation is immune to such negative influences and is an excellent candidate for stylistic analysis. Finally, a qualitative analysis is performed on TraSE using physical human characteristics, such as age, to validate its claim of capturing cognitive traits.