Measuring Global Similarity between Texts
This work addresses the problem of measuring text similarity for researchers and practitioners, but appears incremental: it builds on existing approaches, adding a global perspective.
The authors tackle the problem of measuring similarity between texts by proposing a new measure that takes a global view; experiments on several corpora show that it can reliably identify different global text types.
We propose a new similarity measure between texts which, in contrast to current state-of-the-art approaches, takes a global view of the texts to be compared. We have implemented a tool to compute our textual distance and conducted experiments on several corpora of texts. The experiments show that our methods can reliably identify different global types of texts.