Juan Luis Esteban

CL
5papers
699citations
Novelty36%
AI Score25

5 Papers

DSJun 14, 2022
The Maximum Linear Arrangement Problem for trees under projectivity and planarity

Lluís Alemany-Puig, Juan Luis Esteban, Ramon Ferrer-i-Cancho

A linear arrangement is a mapping $π$ from the $n$ vertices of a graph $G$ to $n$ distinct consecutive integers. Linear arrangements can be represented by drawing the vertices along a horizontal line and drawing the edges as semicircles above said line. In this setting, the length of an edge is defined as the absolute value of the difference between the positions of its two vertices in the arrangement, and the cost of an arrangement as the sum of all edge lengths. Here we study two variants of the Maximum Linear Arrangement problem (MaxLA), which consists of finding an arrangement that maximizes the cost. In the planar variant for free trees, vertices have to be arranged in such a way that there are no edge crossings. In the projective variant for rooted trees, arrangements have to be planar and the root of the tree cannot be covered by any edge. In this paper we present algorithms that are linear in time and space to solve planar and projective MaxLA for trees. We also prove several properties of maximum projective and planar arrangements, and show that caterpillar trees maximize planar MaxLA over all trees of a fixed size thereby generalizing a previous extremal result on trees.

CLDec 5, 2021Code
The Linear Arrangement Library. A new tool for research on syntactic dependency structures

Lluís Alemany-Puig, Juan Luis Esteban, Ramon Ferrer-i-Cancho

The new and growing field of Quantitative Dependency Syntax has emerged at the crossroads between Dependency Syntax and Quantitative Linguistics. One of the main concerns in this field is the statistical patterns of syntactic dependency structures. These structures, grouped in treebanks, are the source for statistical analyses in these and related areas; dozens of scores devised over the years are the tools of a new industry to search for patterns and perform other sorts of analyses. The plethora of such metrics and their increasing complexity require sharing the source code of the programs used to perform such analyses. However, such code is not often shared with the scientific community or is tested following unknown standards. Here we present a new open-source tool, the Linear Arrangement Library (LAL), which caters to the needs of, especially, inexperienced programmers. This tool enables the calculation of these metrics on single syntactic dependency structures, treebanks, and collection of treebanks, grounded on ease of use and yet with great flexibility. LAL has been designed to be efficient, easy to use (while satisfying the needs of all levels of programming expertise), reliable (thanks to thorough testing), and to unite research from different traditions, geographic areas, and research fields.

DSFeb 5, 2021
Minimum projective linearizations of trees in linear time

Lluís Alemany-Puig, Juan Luis Esteban, Ramon Ferrer-i-Cancho

The Minimum Linear Arrangement problem (MLA) consists of finding a mapping $π$ from vertices of a graph to distinct integers that minimizes $\sum_{\{u,v\}\in E}|π(u) - π(v)|$. In that setting, vertices are often assumed to lie on a horizontal line and edges are drawn as semicircles above said line. For trees, various algorithms are available to solve the problem in polynomial time in $n=|V|$. There exist variants of the MLA in which the arrangements are constrained. Iordanskii, and later Hochberg and Stallmann (HS), put forward $O(n)$-time algorithms that solve the problem when arrangements are constrained to be planar (also known as one-page book embeddings). We also consider linear arrangements of rooted trees that are constrained to be projective (planar embeddings where the root is not covered by any edge). Gildea and Temperley (GT) sketched an algorithm for projective arrangements which they claimed runs in $O(n)$ but did not provide any justification of its cost. In contrast, Park and Levy claimed that GT's algorithm runs in $O(n \log d_{max})$ where $d_{max}$ is the maximum degree but did not provide sufficient detail. Here we correct an error in HS's algorithm for the planar case, show its relationship with the projective case, and derive simple algorithms for the projective and planar cases that run without a doubt in $O(n)$ time.

CLJul 30, 2020
The optimality of syntactic dependency distances

Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez, Juan Luis Esteban et al.

It is often stated that human languages, as other biological systems, are shaped by cost-cutting pressures but, to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies and the space is defined by the linear order of the words in the sentence. We introduce a new score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of languages are optimized to a 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions, i.e. that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a new hierarchical ranking of languages by their degree of optimization. The new score has implications for various fields of language research (dependency linguistics, typology, historical linguistics, clinical linguistics and cognitive science). Finally, the principles behind the design of the score have implications for network science.

DMJun 24, 2020
Bounds of the sum of edge lengths in linear arrangements of trees

Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez, Juan Luis Esteban

A fundamental problem in network science is the normalization of the topological or physical distance between vertices, that requires understanding the range of variation of the unnormalized distances. Here we investigate the limits of the variation of the physical distance in linear arrangements of the vertices of trees. In particular, we investigate various problems on the sum of edge lengths in trees of a fixed size: the minimum and the maximum value of the sum for specific trees, the minimum and the maximum in classes of trees (bistar trees and caterpillar trees) and finally the minimum and the maximum for any tree. We establish some foundations for research on optimality scores for spatial networks in one dimension.