Henrique Ferraz de Arruda

CL
3 papers
1 citation
Novelty 20%
AI Score 30

3 Papers

SOC-PH May 29
SF-LIFE: A Large-Scale Simulated Movement Dataset for the San Francisco Bay Area

Chanuka Algama, Taylor Anderson, Henrique Ferraz de Arruda et al.

We introduce SF-LIFE, a large-scale simulated movement dataset designed to accelerate research in transportation, mobility, and machine learning. The dataset contains 3,024,000,000,000 location records capturing complete, noise-free, multi-modality trajectories of 500,000 simulated agents, observed at a frequency of 1 Hz as they navigate the San Francisco Bay Area network over a 70-day period. The data captures (1) needs-driven daily agendas of individual agents, generated by an agent-based simulation of human patterns of life, and (2) detailed kinematic trajectories that move agents across the OpenStreetMap representation of San Francisco using data from 40+ transit agencies across 9 counties. SF-LIFE provides unprecedented scale and detail: trajectories are based on real transit infrastructure described by San Francisco General Transit Feed Specification (GTFS) data, and agent movements span multiple modalities, including bus, rail, bike, automobile, and walking. For this high-fidelity simulated representation of San Francisco, we provide (1) the full trajectory data annotated with transportation mode labels, (2) reduced-size versions of the trajectory data with lower temporal frequency, (3) agent activity information describing the activity that causes an agent to visit a place, (4) agent demographic data, and (5) the underlying OSM road network and building data. As the first dataset of its scale and level of detail, SF-LIFE overcomes the privacy, noise, and completeness limitations inherent in real-world tracking data, providing a robust and ethically sourced resource for research in transit optimization, human mobility analysis, and urban computing.
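
The record count quoted in the abstract follows directly from the stated agent count, sampling rate, and duration. The short Python check below reproduces it. The function names and the 60-second downsampling period are hypothetical, chosen only to illustrate how a reduced-frequency version would shrink; the abstract does not state the frequencies actually released.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def total_records(agents: int, days: int, period_s: int = 1) -> int:
    """Number of location records when every agent is sampled once
    every `period_s` seconds for `days` full days."""
    samples_per_agent = (days * SECONDS_PER_DAY) // period_s
    return agents * samples_per_agent

# Values taken from the abstract: 500,000 agents, 1 Hz, 70 days.
full = total_records(agents=500_000, days=70, period_s=1)
print(f"{full:,}")  # 3,024,000,000,000

# Hypothetical reduced version: one sample per minute (an assumption,
# not a frequency named in the abstract).
per_minute = total_records(agents=500_000, days=70, period_s=60)
print(f"{per_minute:,}")  # 50,400,000,000
```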

CL May 31, 2021
A keyword-driven approach to science

Henrique Ferraz de Arruda, Luciano da Fontoura Costa

To a good extent, words can be understood as corresponding to patterns or categories that appeared in order to represent concepts and structures that are particularly important or useful in a given time and space. Words are characterized by being neither completely general nor completely specific, in the sense that the same word can be instantiated or related to several different contexts, depending on specific situations. Indeed, the way in which words are instantiated and associated represents a particularly interesting aspect that can substantially help to better understand the context in which they are employed. Scientific words are no exception to that. In the present work, we approach the associations between a set of words that are particularly relevant in the sense of being not only frequently used in several areas, but also representing concepts that are currently related to some of the main standing challenges in science. More specifically, the study reported here takes into account the words "prediction", "model", "optimization", "complex", "entropy", "random", "deterministic", "pattern", and "database". In order to complement the analysis, we also obtain a network representing the relationship between the adopted areas. Many interesting results were found. First and foremost, several of the words were observed to have markedly distinct associations in different areas. Biology was found to be related to computer science, sharing associations with databases. Furthermore, in most cases, the words "complex", "model", and "prediction" were observed to have several strong associations.

SD Oct 24, 2019
Syntonets: Toward A Harmony-Inspired General Model of Complex Networks

Luciano da Fontoura Costa, Henrique Ferraz de Arruda

We report an approach to obtaining complex networks with diverse topology, here called syntonets, taking into account the consonances and dissonances between notes as defined by scale temperaments. Though only the fundamental frequency is usually considered, in real-world sounds several additional frequencies (partials) accompany the respective fundamental, influencing both timbre and consonance between simultaneous notes. We use a method based on Helmholtz's consonance approach to quantify the consonances and dissonances between each of the pairs of notes in a given temperament. We adopt two distinct partials structures: (i) harmonic; and (ii) shifted, obtained by taking the harmonic components to a given power $β$, which is henceforth called the anharmonicity index. The latter type of sound is more realistic in the sense that it reflects non-linearities implied by real-world instruments. When these consonances/dissonances are estimated along several octaves, respective syntonets can be obtained, in which nodes and weighted edges represent notes and consonances/dissonances, respectively. The obtained results are organized into two main groups: those related to network science and those related to musical theory. Regarding the former group, we have that the syntonets can provide, for varying values of $β$, a wide range of topologies spanning the space comprised between traditional models. Indeed, it is suggested here that syntony may provide a kind of universal complex network model. The musical interpretations of the results include the confirmation of the more regular consonance pattern of the equal temperament, obtained at the expense of a wider range of consonances such as that in the meantone temperament. We also have that scales derived for shifted partials tend to have a wider range of consonances/dissonances, depending on the temperament and anharmonicity strength.
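
As a rough illustration of the construction described above, the Python sketch below builds a small weighted note network from harmonic or shifted partials. Several parts are assumptions rather than the authors' method: the shifted partials are taken to be `f0 * k**beta` (with `beta = 1` recovering the harmonic case), all partials have equal amplitude, and the pairwise dissonance is a Plomp-Levelt-style roughness curve with illustrative constants, swapped in as a stand-in for the Helmholtz-based measure, whose details the abstract does not give. All function names are hypothetical.

```python
import itertools
import math

def partials(f0, n_partials=6, beta=1.0):
    """Assumed partial structure: k-th partial at f0 * k**beta.
    beta = 1 gives the harmonic series; other values shift the partials."""
    return [f0 * k ** beta for k in range(1, n_partials + 1)]

def roughness(f1, f2):
    """Stand-in roughness between two pure tones (not the paper's measure).
    Zero for identical frequencies, largest for small differences,
    and decaying for widely separated tones. Constants are illustrative."""
    s = 0.24 / (0.021 * min(f1, f2) + 19.0)
    x = s * abs(f1 - f2)
    return math.exp(-3.5 * x) - math.exp(-5.75 * x)

def dissonance(fa, fb, n_partials=6, beta=1.0):
    """Sum the roughness over every pair of partials of the two notes."""
    pa = partials(fa, n_partials, beta)
    pb = partials(fb, n_partials, beta)
    return sum(roughness(p, q) for p in pa for q in pb)

def equal_temperament(f_ref=261.63, steps=12):
    """One octave of 12-tone equal temperament starting at f_ref (Hz)."""
    return [f_ref * 2 ** (i / steps) for i in range(steps)]

def build_network(notes, beta=1.0):
    """Weighted edge list: (i, j) -> dissonance between notes i and j."""
    return {
        (i, j): dissonance(notes[i], notes[j], beta=beta)
        for i, j in itertools.combinations(range(len(notes)), 2)
    }

notes = equal_temperament()
harmonic_net = build_network(notes, beta=1.0)
shifted_net = build_network(notes, beta=1.05)  # mildly anharmonic partials

# With harmonic partials, the fifth (7 semitones) comes out less
# dissonant than the minor second (1 semitone) under this stand-in.
print(harmonic_net[(0, 7)] < harmonic_net[(0, 1)])
```

Thresholding or inverting these weights would give a consonance network. How the weights are turned into a final topology, and how several octaves are combined, is left open here because the abstract does not specify it.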