CO · CG · DC · DS · SE · Dec 10, 2019

Optimizing and accelerating space-time Ripley's K function based on Apache Spark for distributed spatiotemporal point pattern analysis

arXiv:1912.04753v1 · 31 citations
Originality: Incremental advance
AI Analysis

This work addresses scalability and performance issues for researchers analyzing large point-of-interest datasets in fields like ecology and urban transportation, though it is incremental as it extends existing parallel computing approaches to the spatiotemporal dimension.

The authors tackled the computational intensity of space-time Ripley's K function for spatiotemporal point pattern analysis by developing a distributed computing method using Apache Spark with four optimization strategies, achieving improved time efficiency as demonstrated in experiments.

With point of interest (POI) datasets increasingly available with fine-grained spatial and temporal attributes, the space-time Ripley's K function is regarded as a powerful approach for analyzing spatiotemporal point processes. However, it is computationally intensive because of its point-wise distance comparisons, edge correction, and simulations for significance testing. Parallel computing technologies such as OpenMP, MPI and CUDA have been used to accelerate the K function, and related experiments have demonstrated substantial acceleration. Nevertheless, previous work has not extended the optimization of Ripley's K function from the spatial dimension to the space-time dimension. Without sophisticated spatiotemporal query and partitioning mechanisms, the extra computational overhead can be problematic. Moreover, these studies were limited by the restricted scalability and relatively high programming cost of parallel frameworks, which impeded their application to large POI datasets and to variations of Ripley's K function. This paper presents a distributed computing method that accelerates the space-time Ripley's K function on the state-of-the-art distributed computing framework Apache Spark; four strategies are adopted to simplify the calculation procedures and to accelerate the distributed computation, respectively. Based on the optimized method, a prototype web-based visual analytics framework has been developed. Experiments demonstrate the feasibility and time efficiency of the proposed method, and show its value for promoting applications of the space-time Ripley's K function in ecology, geography, sociology, economics, urban transportation and other fields.
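
To make the cost of the point-wise distance comparisons concrete, here is a minimal single-machine sketch of a space-time K estimate. It is not the authors' Spark implementation and does not include any of their four optimization strategies. It assumes a common textbook form of the estimator, K(s, t) = A · T / n² · Σ_i Σ_{j≠i} 1[d_ij ≤ s] · 1[|t_i − t_j| ≤ t], with no edge correction and no significance simulation. The function name `space_time_k` and its parameters are hypothetical, chosen for illustration only.

```python
import math
from itertools import combinations


def space_time_k(points, s, t, area, duration):
    """Naive space-time Ripley's K estimate (assumed form, no edge correction).

    points   -- list of (x, y, time) tuples
    s        -- spatial distance threshold
    t        -- temporal distance threshold
    area     -- area A of the study region
    duration -- length T of the study period
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    ordered_pairs = 0
    # Every unordered pair is visited once, so the loop is O(n^2).
    for (x1, y1, t1), (x2, y2, t2) in combinations(points, 2):
        close_in_space = math.hypot(x1 - x2, y1 - y2) <= s
        close_in_time = abs(t1 - t2) <= t
        if close_in_space and close_in_time:
            # The estimator sums over ordered pairs (i, j) and (j, i).
            ordered_pairs += 2

    return area * duration * ordered_pairs / (n * n)


if __name__ == "__main__":
    sample = [(0, 0, 0), (1, 0, 0), (0, 1, 5), (10, 10, 10)]
    print(space_time_k(sample, s=1.5, t=1, area=100, duration=10))
```

The quadratic pair loop is repeated for every (s, t) threshold and again for every simulated pattern in a significance test, which is the workload the abstract describes as motivating a distributed approach.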
