COMSSE Mar 1, 2016

RWebData: A High-Level Interface to the Programmable Web

arXiv:1603.00293v2, 3 citations
AI Analysis

This tool lowers the barrier for social science researchers with no web technology experience, but the contribution is incremental: it builds on existing R and web data practices.

The authors address the high up-front cost of collecting and preparing data from the programmable web for statistical analysis in R. The resulting RWebData package automatically maps nested web data into a flat, table-like format for direct use in R.

The rise of the programmable web offers new opportunities for the empirically driven social sciences. The access, compilation and preparation of data from the programmable web for statistical analysis can, however, involve substantial up-front costs for the practical researcher. The R-package RWebData provides a high-level framework that allows data to be easily collected from the programmable web in a format that can directly be used for statistical analysis in R (R Core Team 2013), without having to deal with the data's initial format and nesting structure. It was developed specifically for users who have no experience with web technologies and use R merely as statistical software. The core idea and methodological contribution of the package is the separation of parsing web data from mapping them, with a generic algorithm (independent of the initial data structure), to a flat table-like representation. This paper provides an overview of the high-level functions for R-users, explains the basic architecture of the package, and illustrates the implemented data mapping algorithm.
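To make the idea of mapping nested web data to a flat table concrete, here is a minimal sketch in Python. It is not the RWebData algorithm and does not use the package's API. The function name `flatten`, the dotted column naming, and the sample JSON are all assumptions made up for illustration. The sketch keeps the two steps separate, as the abstract describes: parsing is left to a standard JSON parser, and a generic recursive function then turns whatever nested structure comes out into rows.

```python
import json


def flatten(node, prefix=""):
    """Map a parsed JSON-like value to a list of flat dicts (one per row).

    Hypothetical helper for illustration only; not part of RWebData.
    """
    if isinstance(node, dict):
        rows = [{}]
        for key, value in node.items():
            name = f"{prefix}.{key}" if prefix else key
            sub_rows = flatten(value, name)
            # Combine every row built so far with every row of this field.
            rows = [{**row, **sub} for row in rows for sub in sub_rows]
        return rows
    if isinstance(node, list):
        # Each list element contributes its own row(s).
        rows = []
        for item in node:
            rows.extend(flatten(item, prefix))
        # An empty list must not wipe out the parent record.
        return rows or [{}]
    # Leaf value: a single cell named after its path in the document.
    return [{prefix: node}]


# Step 1: parsing (format-specific). The sample document is made up.
raw = (
    '{"city": "Basel", "readings": ['
    '{"day": "Mon", "temp": 4}, {"day": "Tue", "temp": 6}]}'
)
parsed = json.loads(raw)

# Step 2: mapping (format-independent) to a flat, table-like structure.
rows = flatten(parsed)
for row in rows:
    print(row)
```

Each repeated element under `readings` becomes one row, and the non-repeated `city` value is copied into every row, which is the shape a statistical package expects from a data frame. Because the mapping step only sees dictionaries, lists, and leaf values, the same function would work on data parsed from other formats into that structure.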

Code Implementations: 1 repo