SOC-PH CL SI DATA-AN, Feb 19, 2013

Complex networks analysis of language complexity

arXiv:1302.4490v1, 37 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for rapid screening of text complexity to improve accessibility for readers with limited reading ability, though its accuracy is lower than that of methods using deep linguistic knowledge.

The paper quantifies textual complexity by representing texts as co-occurrence networks. It finds that topological regularity correlates negatively with complexity and that distances between concepts decrease in less complex texts. Using multivariate pattern recognition on the network metrics, the authors distinguished original texts from their simplified versions. The distinction was easier for strongly simplified texts, where node strength, shortest paths and diversity were the most relevant metrics.

Methods from statistical physics, such as those involving complex networks, have been increasingly used in quantitative analysis of linguistic phenomena. In this paper, we represented pieces of text with different levels of simplification in co-occurrence networks and found that topological regularity correlated negatively with textual complexity. Furthermore, in less complex texts the distance between concepts, represented as nodes, tended to decrease. The complex network metrics were treated with multivariate pattern recognition techniques, which allowed us to distinguish between original texts and their simplified versions. For each original text, two simplified versions were generated manually with an increasing number of simplification operations. As expected, distinction was easier for the strongly simplified versions, where the most relevant metrics were node strength, shortest paths and diversity. Also, the discrimination of complex texts was improved with higher hierarchical network metrics, thus pointing to the usefulness of considering wider contexts around the concepts. Though the accuracy rate in the distinction was not as high as in methods using deep linguistic knowledge, the complex network approach is still useful for a rapid screening of texts whenever assessing complexity is essential to guarantee accessibility to readers with limited reading ability.
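
To make the network representation concrete, the following is a minimal Python sketch of a co-occurrence network and two of the metrics named in the abstract, node strength and shortest paths. It is an illustration under stated assumptions, not the authors' pipeline: it links adjacent words with undirected weighted edges, skips any preprocessing such as stopword removal or lemmatization, and measures path length in unweighted hops. The function names and the sample sentence are hypothetical.

```python
from collections import defaultdict, deque


def cooccurrence_network(tokens):
    """Link each pair of adjacent tokens; the edge weight counts repetitions.

    Assumption: edges are undirected. The paper may use a different scheme.
    """
    weights = defaultdict(lambda: defaultdict(int))
    for a, b in zip(tokens, tokens[1:]):
        if a != b:  # ignore self-loops from immediate word repetition
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def node_strength(weights):
    """Strength of a node = sum of the weights of its incident edges."""
    return {node: sum(nbrs.values()) for node, nbrs in weights.items()}


def mean_shortest_path(weights):
    """Average hop distance over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in weights:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in weights[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical sample text, already tokenized by whitespace.
tokens = "the cat saw the dog and the dog saw the bird".split()
net = cooccurrence_network(tokens)
strength = node_strength(net)
avg_path = mean_shortest_path(net)
print(strength)
print(round(avg_path, 3))
```

Under these assumptions, comparing `avg_path` between an original text and its simplified version would be one way to probe the reported tendency for distances between concepts to shrink as texts become less complex.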

